Abstract. A fundamental problem of Internet traffic engineering is bandwidth
estimation: determining the bandwidth (bits per second) required to
carry traffic with a specific bit rate (bits per second) offered to an Internet link 
and satisfy quality-of-service requirements. The traffic is packets of varying 
sizes that arrive for transmission on the link. Packets can queue up and are 
dropped if the queue size (bits) is bigger than the size of the buffer (bits) 
for the queue. For the predominant traffic on the Internet, best-effort traffic, 
quality metrics are the packet loss (fraction of lost packets), a queueing de- 
lay (seconds) and the delay probability (probability of a packet exceeding 
the delay). This article presents an introduction to bandwidth estimation and 
a solution to the problem of best-effort traffic for the case where the qual- 
ity criteria specify negligible packet loss. The solution is a simple statistical 
model: (1) a formula for the bandwidth as a function of the delay, the delay 
probability, the traffic bit rate and the mean number of active host-pair con- 
nections of the traffic and (2) a random error term. The model is built and 
validated using queueing theory and extensive empirical study; it is valid for 
traffic with 64 host-pair connections or more, which is about 1 megabit/s 
of traffic. The model provides for Internet best-effort traffic what the Erlang 
delay formula provides for queueing systems with Poisson arrivals and i.i.d. 
exponential service times. 

Key words and phrases: Queueing, Erlang delay formula, nonlinear time 
series, long-range dependence, QoS, statistical multiplexing, Internet traffic, 
capacity planning. 



1. INTRODUCTION: CONTENTS OF THE PAPER 

The Internet is a worldwide computer network. 
At any given moment, a vast number of pairs of hosts 
are transferring files to one another. Each transferred file
is broken up into packets that are sent along a path
across the Internet that consists of links and nodes.
The first node is the sending host: a packet exits the 
host and travels along a link (fiber, wire, cable or air) 
to a first router node, then over a link to a second router 
node and so forth until the last router sends the packet 
to a receiving host node over a final link. 

The packet traffic arriving for transmission on an 
Internet link is a stream: a sequence of packets with 
arrival times (seconds) and sizes (bytes or bits). The 
packets come from pairs of hosts using the link for 
their transfers; that is, the link lies on the path from 
one host to another for each of a collection of pairs of 
hosts. When a packet arrives for transmission on a link, 
it enters a buffer (bits) where it must wait if there are 
other packets waiting for transmission or if a packet is 
in service, that is, in the process of moving out of the
buffer onto the link. If the buffer is full, the packet is
dropped. 

A link has a bandwidth (bits per second), the rate 
at which the bits of a packet are put on the link. Over 
an interval of time during which the traffic is station- 
ary, the packets arrive for transmission at a certain 
rate — the traffic bit rate (bits per second), which is 
defined formally to be the mean of the packet sizes 
(bits) divided by the mean packet interarrival time (sec- 
onds); this is approximately the mean number of arriv- 
ing bits over the interval divided by the interval length 
(seconds). Over the interval there is a mean simultane- 
ous active connection load, which is the mean number 
of source-destination pairs of hosts actively sending 
packets over the link. The utilization of the link is the 
traffic bit rate divided by the bandwidth; it measures 
the traffic rate relative to the capacity of the link. 

This article presents results on a fundamental prob- 
lem of engineering the Internet. What link bandwidth 
is needed to accommodate traffic with a certain bit rate 
and ensure that the transmission on the link maintains 
quality-of-service (QoS) criteria? The QoS bandwidth 
must be found for every link set up on the Internet, 
from the low-bandwidth links connected to the com- 
puters of home users to the high-bandwidth links of a 
major Internet service provider. Our approach to solv- 
ing the bandwidth estimation problem is to use queue- 
ing theory and queueing simulations to build a model 
for the QoS bandwidth. The traffic inputs are live 
streams from measurements of live links and synthetic 
streams from statistical models for traffic streams. 

Section 2 describes transmission control protocol/ 
Internet protocol (TCP/IP) transmission technology, 
which governs almost all computer networking today; 
for example, the networks of Internet service providers, 
universities, companies and homes. Section 2 also de- 
scribes the buffer queueing process and its effect on the 
QoS of file transfer. 

Section 3 formulates the particular version of the 
bandwidth estimation problem that is addressed here, 
discusses why the statistical properties of the packet 
streams are so critical to bandwidth estimation and out- 
lines how we use queueing simulations to study the 
problem. We study best-effort Internet traffic streams 
because they are the predominant type of traffic on 
Internet links today. The QoS criteria for best-effort 
streams are the packet loss (fraction of lost packets), 
the queueing delay (seconds) and the delay probabil- 
ity (probability of a packet exceeding the delay). We 
suppose that the link packet loss is negligible and find 
the QoS bandwidth required for a packet stream of a
certain load that satisfies the delay and the delay probability.

Section 4 describes fractional sum-difference (FSD) 
time series models, which are used to generate the syn- 
thetic streams for the queueing simulations. The FSD 
models — a new class of non-Gaussian, long-range de- 
pendent time series models — provide excellent fits to 
packet size time series and to packet interarrival time 
series. The validation of the FSD models is critical to 
this study. The validity of our solution to the bandwidth 
estimation problem depends on having traffic inputs to 
the queueing that reproduce the statistical properties of 
best-effort traffic. Of course, the live data have these 
properties, but we need assurance that the synthetic 
data do as well. 

Section 5 describes the live packet arrivals and sizes, 
and the synthetic packet arrivals and sizes that are gen- 
erated by the FSD models. Section 6 gives the details 
of the simulations and the resulting delay data: values 
of the QoS bandwidth, delay, delay probability, mean 
number of active host-pair connections of the traffic 
and traffic bit rate. 

Model building, based on the simulation delay data 
and on queueing theory, begins in Section 7. To do the 
model building and diagnostics, we exploit the struc- 
ture of the delay data — utilizations for all combinations 
of delay and delay probability for each stream, live 
or synthetic. We develop an initial model that relates, 
for each stream, the QoS utilization (bit rate divided 
by the QoS bandwidth) to the delay and delay proba- 
bility. We find a transformation for the utilization for 
which the functional dependence on the delay and de- 
lay probability does not change with the stream. There 
is also an additive stream coefficient that varies across 
streams, characterizing the statistical properties of each 
stream. This stream-coefficient delay model cannot be 
used for bandwidth estimation because the stream co- 
efficient is not known in practice. 

Next we add two variables to the model that mea- 
sure the statistical properties of the streams and that 
can be specified or measured in practice — the traffic bit 
rate and the number of simultaneous active host-pair 
connections on the link — and drop the stream coeffi- 
cients. In effect we have modeled the coefficients. The 
result is the best-effort delay model: a best-effort delay 
formula for the utilization as a function of (1) the de- 
lay, (2) the delay probability, (3) the traffic bit rate and 
(4) the mean number of active host-pair connections of 
the traffic, plus a random error term. 

Section 8 presents a method for bandwidth estimation
that starts with the value from the best-effort delay
formula and then uses the error distribution of the best-effort
delay model to find a tolerance interval whose
minimum value provides a conservative estimate with
a low probability of being too small.

Section 9 discusses previous work on bandwidth es- 
timation and how it differs from the work here. Sec- 
tion 10 is an extended abstract. Readers who seek just 
results can proceed to this section; those not famil- 
iar with Internet engineering technology might want to 
read Sections 2 and 3 first. 

The following notation is used throughout the arti- 
cle: 

Packet stream

$v$    arrival numbers (number): $v = 1$ is the first packet, $v = 2$ is the second packet, etc.
$a_v$  arrival times (seconds)
$t_v$  interarrival times (seconds): $t_v = a_{v+1} - a_v$
$q_v$  sizes (bytes or bits).

Traffic load

$c$         mean number of simultaneous active connections (number)
$\tau$      traffic bit rate (bits per second)
$\gamma_p$  connection packet rate (packets per second per connection)
$\gamma_b$  connection bit rate (bits per second per connection).

Bandwidth

$\beta$  bandwidth (bits per second)
$u$      utilization (fraction): $u = \tau/\beta$.

Queueing

$\delta$  packet delay (seconds)
$\omega$  delay probability (fraction).

2. INTERNET TECHNOLOGY 

The Internet is a computer network over which 
a pair of host computers can transfer one or more files 
(Stevens, 1994). Consider the downloading of a Web 
page, which is often made up of more than one file. 
One host — the client — sends a request file to start the 
downloading of the page. Another host — the server — 
receives the request file and sends back a first response 
file. This process continues until all of the response 
files necessary to display the page are sent. The client 
passes the received response files to a browser such as 
Netscape, which then displays the page on the screen. 
This section gives information about some of the Inter- 
net engineering protocols involved in such file transfer. 



2.1 Packet Communications 

When a file is sent, it is broken up into packets whose 
sizes are 1460 bytes or less. The packets are sent from 
the source host to the destination host, where they are 
reassembled to form the original file. They travel along 
a path across the Internet that consists of transmission 
links and routers. The source computer is connected 
to a first router by a transmission link, the first router 
is connected to a second router by another transmis- 
sion link and so forth. A router has input links and 
output links. When it receives a packet from one of 
its input links, it reads the destination address on the 
packet, determines which of the routers connected to it 
by output links gets the packet and sends out the packet 
over the output link connected to that router. The flight 
across the Internet ends when a final router receives the 
packet on one of its input links and sends the packet to 
the destination computer over one of its output links. 

The two hosts establish a connection to carry out one 
or more file transfers. The connection consists of software
running on the two computers that manages the
sending and receiving of packets. The software exe- 
cutes an Internet transport protocol, a detailed prescrip- 
tion for how the sending and receiving should work. 
The two major transport protocols are the user data- 
gram protocol (UDP) and the transmission control pro- 
tocol (TCP). UDP just sends the packets out. With TCP, 
the two hosts exchange control packets that manage the 
connection. TCP opens the connection, closes it, re- 
transmits packets not received by the destination and 
controls the rate at which packets are sent based on 
the amount of retransmission that occurs. The transport 
software adds a header to each packet that contains in- 
formation about the file transfer. The header is 20 bytes 
for TCP and 8 bytes for UDP. 

Software running on the two hosts implements another
network protocol, the Internet protocol (IP), which
manages the involvement of the two hosts in rout- 
ing a packet across the Internet. The software adds 
a 20-byte IP header to the packet with information 
needed for the routing such as the source host IP ad- 
dress and the destination host IP address. IP epito- 
mizes the conceptual framework that underlies Internet 
packet transmission technology. The networks that 
make up the Internet — for example, the networks of 
Internet service providers, universities, companies and 
homes — are often referred to as IP networks, although 
today it is unnecessary because almost all computer 
networking is IP, a public-domain technology that
defeated all other contenders, including the proprietary
systems of big computer and communications
companies. 

2.2 Link Bandwidth 

The links along the path between the source and the 
destination hosts each have a bandwidth in bits per 
second. The bandwidth refers to the speed at which the 
bits of a packet are put on the link by a computer or 
router. For the link connecting a home computer to a 
first router, $\beta$ might be 56 kilobits/s if the computer
uses an internal modem or 1.5 megabits/s if there is 
a broadband connection, a cable or DSL link. The link 
connecting a university computer to a first router might 
be 10 megabits/s, 100 megabits/s or 1 gigabit/s. The 
links on the core network of a major Internet service 
provider have a wide range of bandwidths; typical val- 
ues range from 45 megabits/s to 10 gigabits/s. For a 
40-byte packet, which is 320 bits, it takes 5.714 ms 
to put the packet on a 56-kilobit/s link and takes 
0.032 μs to put it on a 10-gigabit/s link, which is
about 180,000 times faster. Once a bit is put on the 
link, it travels down the link at the speed of light. 
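The transmission-time arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `transmission_time` is ours and does not come from the article.

```python
def transmission_time(packet_bytes, bandwidth_bps):
    """Seconds needed to put a packet of the given size on a link."""
    return packet_bytes * 8 / bandwidth_bps


# A 40-byte (320-bit) packet on the two links discussed in the text.
modem = transmission_time(40, 56e3)   # 56-kilobit/s modem link
core = transmission_time(40, 10e9)    # 10-gigabit/s core link

print(f"{modem * 1e3:.3f} ms")        # 5.714 ms
print(f"{core * 1e6:.3f} us")         # 0.032 us
print(round(modem / core))            # 178571, i.e. about 180,000
```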

2.3 Active Connections, Statistical Multiplexing 
and Measures of Traffic Loads 

At any given moment, an Internet link has a number 
of simultaneous active connections; this is the number 
of pairs of computers connected with one another that 
are sending packets over the link. The packets of the 
different connections are intermingled on the link; for 
example, if there are three active connections, the ar- 
rival order of 10 consecutive packets by connection 
number might be 1, 1, 2, 3, 1, 1, 3, 3, 2 and 3. The inter- 
mingling is referred to as statistical multiplexing. On a 
link that connects a local network with about 500 users 
there might be 300 active connections during a peak 
period. On the core link of an Internet service provider 
there might be 60,000 active connections. 

During an interval of time when the traffic is station- 
ary, there are a mean number of active connections c 
and a traffic bit rate $\tau$ in bits per second. Let $\mu^{(t)}$ in
seconds be the mean packet interarrival time and let
$\mu^{(q)}$ in bits be the mean packet size. Then the packet
arrival rate per connection is $\gamma_p = c^{-1}/\mu^{(t)}$ packets/s
per connection. The bit rate per connection is
$\gamma_b = \mu^{(q)} c^{-1}/\mu^{(t)} = \tau c^{-1}$ bits/s per connection. The variables
$\gamma_p$ and $\gamma_b$ measure the average host-to-host speed
of Internet connections (e.g., the rate at which the file
of a page is downloaded) for the pairs of hosts that use
the link.
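As a small illustration of these load measures, the following sketch computes the traffic bit rate, the connection packet rate and the connection bit rate from a list of arrival times and packet sizes. The function name, the argument names and the toy stream are assumptions made for the example; the mean number of active connections is taken as given rather than measured.

```python
def traffic_load(arrival_times, sizes_bits, mean_connections):
    """Return (traffic bit rate, connection packet rate, connection bit rate)."""
    interarrivals = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    mean_interarrival = sum(interarrivals) / len(interarrivals)  # seconds
    mean_size = sum(sizes_bits) / len(sizes_bits)                # bits
    bit_rate = mean_size / mean_interarrival                     # bits/s
    conn_packet_rate = 1.0 / (mean_connections * mean_interarrival)
    conn_bit_rate = bit_rate / mean_connections
    return bit_rate, conn_packet_rate, conn_bit_rate


# Hypothetical stream: one 12,000-bit packet every millisecond, 4 connections.
times = [0.000, 0.001, 0.002, 0.003, 0.004]
sizes = [12000] * 5
rate, packet_rate, conn_rate = traffic_load(times, sizes, mean_connections=4)
```

For this toy stream the traffic bit rate is 12 megabits/s, which splits into 250 packets/s and 3 megabits/s per connection.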



The bit rate of all traffic on the link is $\tau = c\gamma_b$. Of
course, $\tau < \beta$ because bits cannot be put on the link
at a rate faster than the bandwidth. A larger traffic
bit rate $\tau$ requires a larger bandwidth $\beta$. Let us return
to the path across the Internet for the Web page
download discussed earlier. Starting from the link that
connects the client computer to the Internet and proceeding
through the links, $\tau$ tends to increase and,
therefore, so does $\beta$. We start with a low-bandwidth
link, say 1.5 megabits/s, then move to a link at the edge
of a service provider network, say 156 megabits/s,
and then move to the core links of the provider, say
10 gigabits/s. As we continue further, we move from
the core to the service provider edge to a link connected
to the destination computer, so $\tau$ and $\beta$ tend
to decrease.

2.4 Queueing, Best-Effort Traffic and QoS 

A packet arriving for transmission on a link is pre- 
sented with a queueing mechanism. The service time 
for a packet is the time it takes to put the packet on 
the link, which is the packet size divided by the bandwidth
$\beta$. If there are any packets whose transmission is
not completed, then the packet must wait until these 
packets are fully transmitted before its transmission 
can begin. This is the queueing delay. The packets 
waiting for transmission are stored in a buffer, a region 
in the memory of the computer or router. The buffer 
has a size. If a packet arrives and the buffer is full, 
then the packet is dropped. As we will see, the arrival 
process for packets on a link is long-range dependent: 
at low loads, the traffic is very bursty, but as the load 
increases, the burstiness dissipates. For fixed $\tau$ and $\beta$,
bursty traffic results in a much larger queue-height dis- 
tribution than traffic with Poisson arrivals. 
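The queueing mechanism described above can be sketched as a first-in, first-out recursion. The sketch assumes a buffer large enough that no packet is dropped, and the function and variable names are illustrative rather than taken from the article.

```python
def queueing_delays(arrival_times, sizes_bits, bandwidth_bps):
    """Waiting time (seconds) of each packet before its transmission begins.

    Assumes a single FIFO queue and an unbounded buffer (no drops).
    """
    delays = []
    link_free_at = float("-inf")  # time at which the link finishes its backlog
    for arrival, size in zip(arrival_times, sizes_bits):
        start = max(arrival, link_free_at)           # wait for earlier packets
        delays.append(start - arrival)               # queueing delay
        link_free_at = start + size / bandwidth_bps  # add this packet's service time
    return delays


# Hypothetical burst of three packets followed by an isolated packet.
arrivals = [0.000, 0.001, 0.002, 0.010]
sizes = [12000, 12000, 12000, 12000]           # bits
delays = queueing_delays(arrivals, sizes, 4e6)  # 3 ms service time per packet
```

Here the second and third packets wait about 2 ms and 4 ms, while the first and last find the link idle and are not delayed.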

The predominant protocol for managing file trans- 
fers, TCP, changes the rate at which it sends packets 
with file contents. TCP increases the rate when all goes 
well, but reduces the rate when a destination computer 
indicates that a packet has not been received; the as- 
sumption is that congestion somewhere on the path 
has led to a buffer overflow and the rate reduction is 
needed to help relieve the congestion. In other words, 
TCP is closed loop because there is feedback; UDP 
is not aware of dropped packets and does not respond 
to them. 

When traffic is sent across the Internet using TCP or 
UDP and this queueing mechanism, with no attempt to 
add additional protocol features to improve QoS, then 
the traffic is referred to as best effort. The IP networks 
are a best-effort system because the standard protocols
make an effort to get packets to their destination, but
packets can be delayed, lost, or delivered out of order. 
Queueing delay and packet drops degrade the QoS of 
best-effort traffic. For example, for Web page transfers, 
the result is a longer wait by the user, partly because 
the packets sit in the queue and partly because TCP 
reduces its sending rate when retransmission occurs. 
Best-effort traffic contrasts with priority traffic, which,
when it arrives at a router, goes in front of best-effort
packets. Packets for voice traffic over the Internet are 
often given priority. 

3. THE BANDWIDTH ESTIMATION PROBLEM: 
FORMULATION AND STREAM 
STATISTICAL PROPERTIES 

3.1 Formulation 

Poor QoS that results from delays and drops on an 
Internet link can be improved by increasing the link 
bandwidth $\beta$. The service time decreases, so if the traffic
rate $\tau$ remains fixed, the queueing delay distribution
shifts toward smaller values, and delay and loss are reduced. Loss and delay
are also affected by the buffer size; the larger the
buffer size, the fewer the drops, but then the queueing
delay has the potential to increase because the maximum
queueing delay is the buffer size divided by $\beta$.

The bandwidth estimation problem is to choose $\beta$ to
satisfy QoS criteria. The resulting value of $\beta$ is the QoS
bandwidth. The QoS utilization is the value of $u = \tau/\beta$
that corresponds to the QoS bandwidth. When a local
network, such as a company or university, purchases
bandwidth from an Internet service provider, a decision
on $\beta$ must be made. When an Internet service provider
designs its network, it must choose $\beta$ for each of its
links. The decision must be based on the traffic load
and QoS criteria.

Here we address the bandwidth estimation problem 
specifically for links with best-effort traffic. We take 
the QoS criteria to be delay and loss. For delay we 
use two metrics: a delay $\delta$ and the delay probability $\omega$,
the probability that a packet exceeds the delay. For loss 
we suppose that the decision has been made to choose 
a buffer size large enough that drops will be negligi- 
ble. This is, for example, consistent with the current 
practice of service providers on their core links (Iyer,
Bhattacharyya, Taft and Diot, 2003). Of course, a large
buffer size allows the possibility of a large delay, but
setting QoS values for $\delta$ and $\omega$ allows us to control delay
probabilistically. The alternative is to use the buffer
size as a hard limit on delay, but because dropped packets
are an extreme remedy that causes more serious
degradations of QoS, it is preferable to separate loss
and delay control, using the softer probabilistic control 
for delay. Stipulating that packet loss is negligible on 
the link means that for a connection that uses the link, 
another link is the loss bottleneck; that is, if packets of 
the connection are dropped, it will be on another link. It 
also means that TCP feedback can be ignored in study- 
ing the bandwidth estimation problem. 
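Under this formulation, the QoS bandwidth for one packet stream can be found by a search over the bandwidth. The sketch below is only an illustration of the idea and is not the procedure of Section 6: it assumes an unbounded buffer, at least two distinct arrival times, and uses bisection, which is valid because for a fixed stream every packet's queueing delay can only shrink as the bandwidth grows. All names are hypothetical.

```python
def exceedance_fraction(arrival_times, sizes_bits, bandwidth_bps, delay):
    """Fraction of packets whose queueing delay exceeds `delay` (FIFO, no drops)."""
    link_free_at = float("-inf")
    exceed = 0
    for arrival, size in zip(arrival_times, sizes_bits):
        start = max(arrival, link_free_at)
        if start - arrival > delay:
            exceed += 1
        link_free_at = start + size / bandwidth_bps
    return exceed / len(sizes_bits)


def qos_bandwidth(arrival_times, sizes_bits, delay, delay_prob, tol=1e-3):
    """Approximately the smallest bandwidth with P(queueing delay > delay) <= delay_prob."""
    duration = arrival_times[-1] - arrival_times[0]
    low = sum(sizes_bits) / duration  # start near the traffic bit rate
    high = low
    # Double the bandwidth until the delay criterion is met.
    while exceedance_fraction(arrival_times, sizes_bits, high, delay) > delay_prob:
        high *= 2.0
    # Bisect between a failing bandwidth (low) and a passing one (high).
    while high - low > tol * high:
        mid = 0.5 * (low + high)
        if exceedance_fraction(arrival_times, sizes_bits, mid, delay) > delay_prob:
            low = mid
        else:
            high = mid
    return high


# Hypothetical stream: pairs of 8000-bit packets arriving together every 10 ms.
pair_times = [0.01 * i for i in range(100) for _ in range(2)]
pair_sizes = [8000] * 200
beta = qos_bandwidth(pair_times, pair_sizes, delay=0.001, delay_prob=0.0)
utilization = (sum(pair_sizes) / (pair_times[-1] - pair_times[0])) / beta
```

In this toy stream the second packet of each pair waits for one service time, so the search settles near 8 megabits/s, the bandwidth at which 8000 bits take 1 ms to transmit. The resulting utilization is about 0.2.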

3.2 Packet Stream Statistical Properties 

A packet stream consists of a sequence of arriving 
packets, each with a size. Let $v$ be the arrival number:
$v = 1$ is the first packet, $v = 2$ is the second packet and
so forth. Let $a_v$ be the arrival times, let $t_v = a_{v+1} - a_v$
be the interarrival times and let $q_v$ be the size of the
packet arriving at time $a_v$. The statistical properties of
the packet stream can be described by the statistical
properties of $t_v$ and $q_v$ as time series in $v$.

The QoS bandwidth for a packet stream depends
critically on the statistical properties of $t_v$ and $q_v$.
Directly, the bandwidth depends on the queue-length
time process, but the queue-length time process depends
critically on the stream statistical properties.
Here we consider best-effort traffic. It has persistent,
long-range dependent $t_v$ and $q_v$ (Ribeiro, Riedi,
Crouse and Baraniuk, 1999; Gao and Rubin, 2001;
Cao, Cleveland, Lin and Sun, 2001). Persistent, long-range
dependent $t_v$ and $q_v$ have dramatically larger
queue-size distributions than those for independent
$t_v$ and $q_v$ (Konstantopoulos and Lin, 1996; Erramilli,
Narayan and Willinger, 1996; Cao, Cleveland, Lin and
Sun, 2001). The long-range dependent traffic is burstier
than the independent traffic, so the QoS utilization is 
smaller because more headroom is needed to allow for 
the bursts. This finding demonstrates quite clearly the 
impact of the statistical properties, but a corollary of 
the finding is that the results here are limited to best- 
effort traffic streams (or any other streams with similar 
statistical properties). Results for other types of traf- 
fic with quite different statistical properties (e.g., links 
carrying voice traffic using current Internet protocols) 
are different. 

Best-effort traffic is not homogeneous. As the traffic 
connection load c increases, the arrivals tend toward 
Poisson and the sizes tend toward independent (Cao, 
Cleveland, Lin and Sun, 2003; Cao and Ramanan, 
2002). The reason for this is the increased statistical 
multiplexing of packets from different connections; the 
intermingling of the packets of different connections is
a randomization process that breaks down the correlation
of the streams. In other words, the long-range
dependence dissipates. This means that in our band- 
width estimation study, we can expect a changing es- 
timation mechanism as c increases. In particular, we 
expect multiplexing gains, that is, greater utilization 
due to the reduction in dependence. Because of the 
change in properties with c, we must be sure to study 
streams with a wide range of values of c. 

4. FSD TIME SERIES MODELS FOR PACKET 
ARRIVALS AND SIZES 

This section presents FSD time series models, a new 
class of non-Gaussian, long-range dependent models 
(Cao, Cleveland, Lin and Sun, 2003; Cao, Cleveland 
and Sun, 2004). The two independent packet-stream 
time series (the interarrivals $t_v$ and the sizes $q_v$) are
each modeled by an FSD model, and the models are 
used to generate synthetic best-effort traffic streams for 
the queueing simulations in our study. 

There are a number of known properties of $t_v$ and $q_v$
that have to be accommodated by the FSD models.
First, these two time series are long-range dependent.
This is associated with the important discovery of
long-range dependence of packet arrival counts and
of packet byte counts in successive equal-length intervals
of time, such as 10 ms (Leland, Taqqu, Willinger
and Wilson, 1994; Paxson and Floyd, 1995). Second,
$t_v$ and $q_v$ are non-Gaussian. Complex non-Gaussian
behavior was demonstrated clearly in important work
that showed that highly nonlinear multiplicative multifractal
models can account for the statistical properties
of $t_v$ and $q_v$ (Riedi, Crouse, Ribeiro and Baraniuk,
1999; Gao and Rubin, 2001). These nonparametric 
models utilize many coefficients and a complex cas- 
cade structure to explain these properties. Third, the 
statistical properties of the two time series change as 
c increases (Cao, Cleveland, Lin and Sun, 2003). The 
arrivals tend toward Poisson and the sizes tend toward 
independent; there are always long-range dependent 
components present in the series, but the contributions 
of the components to the variances of the series go to 
zero. 

4.1 Solving the Non-Gaussian Challenge 

The challenge in modeling $t_v$ and $q_v$ is their combined
non-Gaussian and long-range dependent properties,
a difficult combination that does not, without
a simplifying approach, allow parsimonious characterization.
We discovered that monotone nonlinear transformations
of the interarrivals and sizes are very well
fitted by parsimonious Gaussian time series, that is,
a very simple class of fractional autoregressive integrated
moving average (ARIMA) models (Hosking,
1981) with a small number of parameters. In other
words, the transformations and the Gaussian models
account for the complex multifractal properties of
$t_v$ and $q_v$ in a simple way.

4.2 The FSD Model Class 

Suppose $x_v$ for $v = 1, 2, \ldots$ is a stationary time
series with marginal cumulative distribution function
$F(x; \phi)$, where $\phi$ is a vector of unknown parameters.
Let $x_v^* = H(x_v; \phi)$ be a transformation of $x_v$
such that the marginal distribution of $x_v^*$ is normal
with mean 0 and variance 1. We have
$H(x_v; \phi) = G^{-1}(F(x_v; \phi))$, where $G(z)$ is the cumulative distribution
function of a normal random variable with mean 0
and variance 1. Next we suppose $x_v^*$ is a Gaussian time
series and call $x_v^*$ the Gaussian image of $x_v$.
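The transformation to a Gaussian image can be illustrated for one concrete marginal. The sketch below assumes a Weibull marginal (the family that Section 4.3 uses for the interarrival times). The function names and the parameter values in the example are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # G: normal distribution with mean 0 and variance 1


def weibull_cdf(x, shape, scale):
    """F(x) for a Weibull distribution with the given shape and scale."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def gaussian_image(x, shape, scale):
    """H(x) = G^{-1}(F(x)); requires 0 < F(x) < 1."""
    return _STD_NORMAL.inv_cdf(weibull_cdf(x, shape, scale))


def inverse_image(z, shape, scale):
    """F^{-1}(G(z)): map a standard normal value back to the Weibull scale."""
    survival = _STD_NORMAL.cdf(-z)  # 1 - G(z), computed without cancellation
    return scale * (-math.log(survival)) ** (1.0 / shape)


# Hypothetical interarrival time of 2 ms, with shape 0.8 and scale 1 ms.
z = gaussian_image(0.002, shape=0.8, scale=0.001)
x_back = inverse_image(z, shape=0.8, scale=0.001)
```

Because the transformation is monotone, the median of the marginal maps to 0 and the inverse recovers the original value up to rounding error.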

Suppose $x_v^*$ has the form

$$x_v^* = \sqrt{1-\theta}\, s_v + \sqrt{\theta}\, n_v,$$

where $s_v$ and $n_v$ are independent of one another and
each has mean 0 and variance 1, $n_v$ is Gaussian white
noise, that is, an independent time series, and $s_v$ is a
Gaussian fractional ARIMA (Hosking, 1981)

$$(I - B)^d s_v = \epsilon_v + \epsilon_{v-1},$$

where $B s_v = s_{v-1}$, $0 < d < 0.5$ and $\epsilon_v$ is Gaussian
white noise with mean 0 and variance

$$\sigma_\epsilon^2 = \frac{(1-d)\,\Gamma^2(1-d)}{2\,\Gamma(1-2d)}.$$

The above time series $x_v$ is a fractional sum-difference
(FSD) time series. Its Gaussian image, $x_v^*$,
has two components: $\sqrt{1-\theta}\, s_v$ is the long-range-dependent
(lrd) component, which has variance $1 - \theta$,
and $\sqrt{\theta}\, n_v$ is the white-noise component, which has
variance $\theta$.

Let $p_{x^*}(f)$ be the power spectrum of the $x_v^*$. Then

$$p_{x^*}(f) = (1-\theta)\,\sigma_\epsilon^2\, \frac{4\cos^2(\pi f)}{(4\sin^2(\pi f))^{d}} + \theta$$

for $0 < f < 0.5$. As $f \to 0.5$, $p_{x^*}(f)$ decreases
monotonically to $\theta$. As $f \to 0$, $p_{x^*}(f)$ goes to infinity
like $\sin^{-2d}(\pi f) \sim f^{-2d}$, one outcome of long-range
dependence. For nonnegative integer lags $k$, let $r_{x^*}(k)$,
$r_s(k)$ and $r_n(k)$ be the autocovariance functions of $x_v^*$,
$s_v$ and $n_v$, respectively. Because the three series have
variance 1, the autocovariance functions are also the
autocorrelation functions. $r_s(k)$ is positive and falls off
like $k^{2d-1}$ as $k$ increases, another outcome of long-range
dependence. For $k > 0$, $r_n(k) = 0$ and

$$r_{x^*}(k) = (1-\theta)\, r_s(k).$$

As $\theta \to 1$, $x_v^*$ goes to white noise: $p_{x^*}(f) \to 1$ and
$r_{x^*}(k) \to 0$ for $k > 0$. The changes in the autocovariance
function and power spectrum are instructive. As
$\theta$ gets closer to 1, the rise of $p_{x^*}(f)$ near $f = 0$ is always
to order $f^{-2d}$ and the rate of decay of $r_{x^*}(k)$ for
large $k$ is always $k^{2d-1}$, but the ascent of $p_{x^*}(f)$ at the
origin begins closer and closer to $f = 0$ and the $r_{x^*}(k)$
get uniformly smaller by the multiplicative factor $1 - \theta$.
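These properties can be checked numerically. The sketch below assumes that the lrd component is fractionally integrated noise driven by the sum of two successive innovations, which is the form implied by the stated innovation variance and by the cosine factor in the power spectrum. It uses the standard closed-form autocovariance of fractionally integrated white noise (Hosking, 1981). The function names are illustrative.

```python
import math


def innovation_variance(d):
    """Innovation variance that gives the lrd component unit variance."""
    return (1.0 - d) * math.gamma(1.0 - d) ** 2 / (2.0 * math.gamma(1.0 - 2.0 * d))


def _fractional_noise_autocov(k, d):
    """Lag-k autocovariance of (I - B)^(-d) applied to unit-variance white noise."""
    k = abs(k)
    const = math.gamma(1.0 - 2.0 * d) / (math.gamma(d) * math.gamma(1.0 - d))
    return const * math.exp(math.lgamma(k + d) - math.lgamma(k + 1.0 - d))


def lrd_autocorrelation(k, d):
    """Autocorrelation of the lrd component, with summed innovations as its input."""
    g = _fractional_noise_autocov
    return innovation_variance(d) * (2.0 * g(k, d) + g(k - 1, d) + g(k + 1, d))


def image_autocorrelation(k, d, theta):
    """Autocorrelation of the Gaussian image: 1 at lag 0, (1 - theta) * r_s(k) otherwise."""
    return 1.0 if k == 0 else (1.0 - theta) * lrd_autocorrelation(k, d)


d = 0.41
variance_check = lrd_autocorrelation(0, d)  # should be 1
decay_ratio = lrd_autocorrelation(2000, d) / lrd_autocorrelation(1000, d)
```

With d = 0.41 the lag-0 value is 1, and doubling a large lag multiplies the autocorrelation by close to 2 raised to the power 2d - 1, in line with the stated power-law decay.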

4.3 Marginal Distributions of q v and t v 

We model the marginal distribution of $t_v$ by a
Weibull with shape $\lambda$ and scale $\alpha$, a family with two
unknown parameters. Estimates of $\lambda$ are almost always
less than 1. The Weibull provides an excellent approximation
of the sample marginal distribution of the $t_v$ except
that the smallest 3-5% of the sample distribution
is truncated to a nearly constant value due to certain
network transmission properties.

The marginal distribution of $q_v$ is modeled as follows.
While packets less than 40 bytes can occur, it
is sufficiently rare that we ignore this and suppose
$40 \le q_v \le 1500$. First, we provide for $A$ atoms at sizes
$\phi_1^{(s)}, \ldots, \phi_A^{(s)}$ such as 40, 512, 576 and 1500 bytes,
which are commonly occurring sizes; the atom probabilities
are $\phi_1^{(a)}, \ldots, \phi_A^{(a)}$. For the remaining sizes,
we divide the interval [40, 1500] bytes into $C$ intervals
using $C - 1$ distinct breakpoints $\phi_1^{(b)}, \ldots, \phi_{C-1}^{(b)}$
with values that are greater than 40 bytes and less than
1500 bytes. For each of the $C$ intervals, the size
distribution is uniform (excluding the atoms) in
the interval; the total probabilities for the intervals are
$\phi_1^{(i)}, \ldots, \phi_C^{(i)}$. Typically, with just three atoms at 40,
576 and 1500 bytes, and with just two breakpoints at
50 and 200 bytes, we get an excellent approximation
of the marginal distribution.
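The atom-plus-interval form of the packet-size distribution is easy to work with. The sketch below computes its mean and draws samples from it. The atom sizes and breakpoints are the typical ones mentioned above, but the probabilities are invented for illustration only, and the function names are ours.

```python
import random


def mean_packet_size(atom_sizes, atom_probs, breakpoints, interval_probs,
                     low=40, high=1500):
    """Mean size (bytes): atoms plus a uniform distribution inside each interval."""
    edges = [low] + list(breakpoints) + [high]
    mean = sum(s * p for s, p in zip(atom_sizes, atom_probs))
    mean += sum(p * 0.5 * (a + b)
                for p, a, b in zip(interval_probs, edges, edges[1:]))
    return mean


def sample_packet_size(rng, atom_sizes, atom_probs, breakpoints, interval_probs,
                       low=40, high=1500):
    """Draw one packet size (bytes) from the atom/interval mixture."""
    edges = [low] + list(breakpoints) + [high]
    u = rng.random()
    for size, p in zip(atom_sizes, atom_probs):
        if u < p:
            return float(size)
        u -= p
    for p, a, b in zip(interval_probs, edges, edges[1:]):
        if u < p:
            return rng.uniform(a, b)
        u -= p
    return float(high)  # reached only through rounding of the probabilities


# Hypothetical probabilities (they sum to 1); not estimates from any link.
atoms, atom_p = [40, 576, 1500], [0.30, 0.10, 0.25]
breaks, interval_p = [50, 200], [0.05, 0.10, 0.20]

mean_bytes = mean_packet_size(atoms, atom_p, breaks, interval_p)
rng = random.Random(1)
draws = [sample_packet_size(rng, atoms, atom_p, breaks, interval_p)
         for _ in range(1000)]
```

With these made-up probabilities the mean is 629.35 bytes, or about 5035 bits per packet.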

4.4 Gaussian Images of q v and t v 

The transformed time series $t_v^*$ and $q_v^*$ appear to be
quite close to Gaussian processes. Some small amount 
of non-Gaussian behavior is still present, but it is mi- 
nor. The autocorrelation structure of these Gaussian 
images is very well fitted by the FSD autocorrelation 
structure. 

The parameters of the FSD model are the following: 

• $q_v$ marginal distribution: $A$ atom probabilities $\phi_i^{(a)}$
at $A$ sizes $\phi_i^{(s)}$; $C - 1$ breakpoints $\phi_i^{(b)}$ and $C$ interval
probabilities $\phi_i^{(i)}$

• $t_v$ marginal distribution: shape $\lambda$ and scale $\alpha$

• $q_v^*$ time dependence: fractional difference coefficient
$d^{(q)}$ and white-noise variance $\theta^{(q)}$

• $t_v^*$ time dependence: fractional difference coefficient
$d^{(t)}$ and white-noise variance $\theta^{(t)}$.

We found that $d^{(q)}$ and $d^{(t)}$ do not depend on $c$; this
is based on empirical study and supported by theory.
The estimated values are 0.410 and 0.411, respectively.
We take the value of each of these two parameters to
be 0.41. We found that as $c$ increases, estimates of $\lambda$,
$\theta^{(q)}$ and $\theta^{(t)}$ all tend toward 1. This means the $t_v$ tend
to independent exponentials (a Poisson process) and
the $q_v$ tend toward independence. In other words, the
statistical models account for the change in $t_v$ and $q_v$
with the increase in $c$ that was discussed earlier. We
estimated these three parameters and $\alpha$ by partial likelihood
methods with $d^{(q)}$ and $d^{(t)}$ fixed to 0.41. The
marginal distribution of $q_v$ on a given link does not
change with $c$, but it does change from link to link. To
generate traffic, we must specify the atom and interval
probabilities. This provides a mean packet size $\mu^{(q)}$,
which is measured in bits per packet.

5. PACKET-STREAM DATA: LIVE AND SYNTHETIC 

We use packet-stream data, that is, values of packet 
arrivals and sizes, to study the bandwidth estimation 
problem. They are used as input traffic for queueing 
simulations. There are two types of streams: live and 
synthetic. The live streams are from packet traces, that 
is, data collection from live Internet links. The syn- 
thetic streams are generated by the FSD models. 

5.1 Live Packet Streams 

A commonly used measurement framework for em- 
pirical Internet studies results in packet traces (Claffy, 
Braun and Polyzos, 1995; Paxson, 1997; Caceres et al., 
2000). The arrival time of each packet on a link is 
recorded and the contents of the headers are captured. 
The vast majority of packets are transported by TCP, 
so this means most headers have 40 bytes, 20 for TCP 
and 20 for IP. The live packet traffic is measured by this 
mechanism over an interval. Time stamps provide live 
interarrival times $t_v$, and headers contain information
that provides live sizes $q_v$, so for each trace there is a
stream of live arrivals and sizes. 

The live stream data base used in this presentation 
consists of 349 streams, 90 s or 5 min in duration, from 
six Internet links that we name BELL, NZIX, AIX1, 
AIX2, MFN1 and MFN2. The measured streams have 
negligible delay on the link input router. The mean 
number of simultaneous active connections $c$ ranges
from 49 connections to 18,976 connections. The traffic
bit rate $\tau$ ranges from 1.00 to 348 megabits/s.

Link BELL is a 100-megabit/s link in Murray 
Hill, New Jersey that connects a Bell Labs local 
network of about 3000 hosts to the rest of the Inter- 
net. The transmission is half-duplex, so both direc- 
tions (in and out) are multiplexed and carried on the 
same link, and a stream comprises the multiplexing 
of both directions, but to keep the variable c com- 
mensurate for all six links, the two directions for 
each connection are counted as two. In this presenta- 
tion we use 195 BELL traces, each 5 min in length. 
Link NZIX is the 100-megabit/s New Zealand In- 
ternet exchange hosted by the ITS department at the 
University of Waikato, Hamilton, New Zealand, that 
served as a peering point among a number of ma- 
jor New Zealand Internet service providers at the 
time of data collection (NZIX trace data available 
at http://wand.cs.waikato.ac.nz/wand/wits/nzix/2). All 
arriving packets from the input-output ports on the 
switch are mirrored, multiplexed and sent to a port 
where they are measured. Because all connections 
have two directions at the exchange, like BELL, each 
connection counts as two. In this presentation we 
use 84 NZIX traces, each 5 min in length. Links 
AIX1 and AIX2 are two separate 622-megabit/s OC12 
packet-over-sonet links, each carrying one direction of 
traffic between NASA Ames and the MAE-West Internet
exchange. In this presentation we use 23 AIX1
and 23 AIX2 traces, each 90 s in length. The AIX1 and 
AIX2 streams were collected as part of a project at the 
National Laboratory for Applied Network Research, 
where the data are collected in blocks of 90 s (available 
at http://pma.nlanr.net/PMA). Links MFN1 and MFN2 
are two separate 2.5-gigabit/s OC48 packet-over-sonet 
links on the network of the service provider MFN; each 
link carries one direction of traffic between San Jose, 
California and Seattle, Washington. In this presenta- 
tion we use 12 MFN1 and 12 MFN2 traces, each 5 min 
in length. 

The statistical properties of streams, as we have 
stated, depend on the connection load c, so it is im- 
portant that the time interval of a live stream be small 
enough that c does not vary appreciably over the in- 
terval. For any link, there is diurnal variation, that is, 
c changes with the time of day due to changes in the 
number of users. We chose 5 min to be the upper
bound of the length of each stream to ensure stationarity.
The BELL, NZIX, MFN1 and MFN2 streams are
5 min; the AIX1 and AIX2 traces are 90 s because the
sampling plan at these sites consisted of noncontiguous
90-s intervals.

5.2 Synthetic Packet Streams 

The synthetic streams are arrivals and sizes generated
by the FSD models for $t_v$ and $q_v$. Each of the live
streams is fitted by two FSD models, one for the $t_v$ and
one for the $q_v$, and a synthetic stream of 5 min is
generated by the models. The generated $t_v$ are independent
of the generated $q_v$, which is what we found in the live
data. The result is 349 synthetic streams that collectively
match the statistical properties of the live streams.

6. QUEUEING SIMULATION 

We study the bandwidth estimation problem through 
queueing simulation with an infinite buffer and a first- 
in-first-out (FIFO) queueing discipline. The inputs to 
the queues are the arrivals and sizes of the 349 live and 
349 synthetic packet streams described in Section 5. 

For each live or synthetic stream, we carry out
25 runs, each with a number of simulations. For each
run we pick a delay $\delta$ and a delay probability $\omega$.
Simulations are carried out to find the QoS bandwidth $\beta$,
the bandwidth that results in delay probability $\omega$ for
the delay $\delta$. This also yields a QoS utilization $u = \tau/\beta$.
We use five delays (0.001, 0.005, 0.010, 0.050 and
0.100 s) and five delay probabilities (0.001, 0.005,
0.01, 0.02 and 0.05), employing all 25 combinations
of the two delay criteria. For each simulation of a
collection, $\delta$ is fixed a priori. We measure the queueing
delay at the arrival times of the packets, which determines
the simulated queueing delay process. From the
simulated process we find the delay probability for the
chosen $\delta$. We repeat the simulation, changing the trial
QoS bandwidth, until the attained delay probability
approximately matches the chosen delay probability $\omega$.
The optimization is easy because the attained delay
probability decreases as the trial QoS bandwidth
increases for fixed $\delta$.
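The search just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used for the study. The queueing delay seen by each arriving packet is computed with the Lindley recursion for a FIFO queue with an infinite buffer, and the trial bandwidth is adjusted by bisection, which is our choice of search method. The function names are hypothetical, and the sketch accepts a bandwidth once the attained delay probability is at most $\omega$ rather than matching it approximately.

```python
import random


def delay_probability(interarrivals, sizes_bits, bandwidth, delta):
    """Fraction of packets whose queueing delay exceeds delta seconds.

    FIFO queue with an infinite buffer. interarrivals[v] is the time between
    packet v-1 and packet v; sizes_bits[v] is the size of packet v in bits.
    """
    wait = 0.0          # queueing delay seen by the arriving packet
    prev_service = 0.0  # transmission time of the previous packet
    exceed = 0
    for t, q in zip(interarrivals, sizes_bits):
        # Lindley recursion: leftover work minus the time since the last arrival.
        wait = max(0.0, wait + prev_service - t)
        if wait > delta:
            exceed += 1
        prev_service = q / bandwidth
    return exceed / len(sizes_bits)


def qos_bandwidth(interarrivals, sizes_bits, delta, omega, rel_tol=1e-3):
    """Bisection for (approximately) the smallest bandwidth meeting (delta, omega)."""
    tau = sum(sizes_bits) / sum(interarrivals)   # traffic bit rate, bits/s
    lo, hi = tau / 0.97, tau / 0.05              # utilization kept in [0.05, 0.97]
    if delay_probability(interarrivals, sizes_bits, lo, delta) <= omega:
        return lo
    if delay_probability(interarrivals, sizes_bits, hi, delta) > omega:
        return hi
    while hi - lo > rel_tol * lo:
        mid = 0.5 * (lo + hi)
        if delay_probability(interarrivals, sizes_bits, mid, delta) > omega:
            lo = mid   # too much delay: more bandwidth is needed
        else:
            hi = mid   # criteria met: try less bandwidth
    return hi


if __name__ == "__main__":
    rng = random.Random(1)
    n = 20000
    # Toy input: Poisson arrivals and exponential sizes, not a live or FSD stream.
    t = [rng.expovariate(1000.0) for _ in range(n)]
    q = [rng.expovariate(1.0 / 8000.0) for _ in range(n)]
    beta = qos_bandwidth(t, q, delta=0.005, omega=0.01)
    print("QoS utilization:", (sum(q) / sum(t)) / beta)
```

Bisection is valid here because, for a fixed packet stream, every waiting time is nonincreasing in the bandwidth, so the attained delay probability is nonincreasing as well.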

In the optimization we do not allow the utilization 
to go above 0.97; in other words, if the true QoS uti- 
lization is above 0.97, we set it to 0.97. The reason is 
that we use the logit scale $\log(u/(1 - u))$ in the modeling,
and above about 0.97 the scale becomes very sensitive
to model misspecification and the accuracy of the
simulation, even though the utilizations above 0.97 for 
practical purposes are nearly equal. Similarly, we limit 
the lower range of the utilizations to 0.05. 
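A short numerical sketch shows why the logit scale is sensitive near 1 and what the truncation does. The helper names are ours, and base 2 is used to match the later modeling.

```python
import math


def logit2(u):
    """Base-2 logit of a utilization strictly between 0 and 1."""
    return math.log2(u / (1.0 - u))


def clip_utilization(u, lo=0.05, hi=0.97):
    """Limit a QoS utilization to the range used in the modeling."""
    return min(max(u, lo), hi)


if __name__ == "__main__":
    for u in (0.90, 0.97, 0.99, 0.999):
        print(u, round(logit2(u), 2), round(logit2(clip_utilization(u)), 2))
```

Moving from 0.97 to 0.99 changes the utilization by two percentage points but shifts the base-2 logit from about 5.0 to about 6.6, which is why utilizations that are nearly equal for practical purposes are far apart on the logit scale.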

The result of the 25 runs for each of the 349 live and
349 synthetic streams is 25 measurements, one per run,
of each of five variables: QoS utilization $u$, delay $\delta$,
delay probability $\omega$, mean number of active connections $c$
and bit rate $\tau$. The first three variables vary from run to
run; the last two are the same for the 25 runs
for a stream because they measure the stream statistical
properties. By design, the range of $\delta$ is 0.001-0.100 s
and the range of $\omega$ is 0.001-0.05. The range of the
QoS utilizations is 0.05-0.97. Variable $c$ ranges from 49 to
18,976 connections and $\tau$ ranges from 1.00 to 348 megabits/s.

7. MODEL BUILDING: A BANDWIDTH FORMULA 
PLUS RANDOM ERROR 

This section describes the process of building the
best-effort delay model, which is the best-effort delay
formula plus random error. The model describes the
dependence of the utilization $u$ on the delay $\delta$, the
delay probability $\omega$, the traffic bit rate $\tau$ and the expected
number of active connections $c$. The modeling process
involves both theory and empirical study, and
establishes a basis for the model.

The theoretical basis is queueing theory. The empir- 
ical basis is the delay data from the queueing simula- 
tions, the measurements of the five variables described 
in Section 6. The following notation is used for the values
of these five variables for either the live delay data
or the synthetic delay data. The $\delta_j$ for $j$ = 1-5 are the
five values of the delay in increasing order and the $\omega_k$
for $k$ = 1-5 are the five values of the delay probability
in increasing order. The variable $u_{ijk}$ is the QoS utilization
for delay $\delta_j$, delay probability $\omega_k$ and stream $i$,
where $i$ = 1-349. For stream $i$, $\tau_i$ is the traffic bit rate
and $c_i$ is the mean number of active connections.

7.1 Strategy: Initial Modeling of Dependence
on $\delta$ and $\omega$

The structure of the data provides an opportunity for
careful initial study of the dependence of the $u_{ijk}$ on
$\delta_j$ and $\omega_k$. We have 25 measurements of each of these
variables for each stream $i$, and for these measurements
both $\tau_i$ and $c_i$ are constant. We start our model building
by exploiting this opportunity.

We consider modeling each stream separately, but 
hope to get model consistency across streams that al- 
lows simplification. If such simplicity occurs, it is 
likely to require a monotone transformation of the $u_{ijk}$
because they vary between 0 and 1. So we begin,
conceptually, with a model of the form

    $f(u_{ijk}) = g_i(\delta_j, \omega_k) + \varepsilon_{ijk}$.

The $\varepsilon_{ijk}$ are a sample from a distribution with mean 0,
$f$ is a monotone function of $u$, and $g_i$ is a function of
$\delta$ and $\omega$. We want to choose $f$ to make $g_i$ as simple as
possible, that is, to vary as little as possible with $i$.

7.2 Conditional Dependence of $u$ on $\delta$

We start our exploration of the data by taking
$f(u) = u$ and supposing that a logical scale for $\delta$ is the
log. In all cases we use log base 2 and indicate this
by writing $\log_2$ in our formulas. We do not necessarily
believe that this identity function for $f$ is the right
transformation, but it is helpful to study the data
initially on the untransformed utilization scale.

Our first step is to explore the conditional dependence
of $u_{ijk}$ on $\log_2(\delta_j)$ given $\omega_k$ and the stream $i$ by
trellis display (Becker, Cleveland and Shyu, 1996). For
each combination of the delay probability $\omega_k$ and the
stream $i$, we graph $u_{ijk}$ against $\log_2(\delta_j)$. We did this
once for all 349 live streams and once for all 349 synthetic
streams. Figure 1 illustrates this by a trellis display
for 16 of the live streams. The 16 streams were
chosen to nearly cover the range of values of the $\tau_i$.
Let $\tau_{(v)}$ for $v$ = 1-349 be the values ordered from
smallest to largest, and take $\tau_{(v)}$ to be the quantile of
the empirical distribution of the values of order $v/349$.
Then we chose the 16 streams whose ranks $v$ yield orders
closest to the 16 equally spaced orders from 0.05
to 0.95. On the figure, there are 80 panels divided into
10 columns and 8 rows. On each panel $u_{ijk}$ is graphed
against $\log_2(\delta_j)$ for one value of $\omega_k$ and one stream.
The strip labels at the top of each panel give the value
of $\omega_k$ and the rank of the stream. There are five points
per panel, one for each value of $\log_2(\delta_j)$.
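The selection of the 16 display streams can be sketched as below. The function names are hypothetical, and ties in closeness (which do not arise for 349 streams) are broken in favor of the smaller rank.

```python
def pick_ranks(n_streams=349, n_pick=16, lo=0.05, hi=0.95):
    """Ranks v whose orders v / n_streams are closest to equally spaced orders."""
    targets = [lo + k * (hi - lo) / (n_pick - 1) for k in range(n_pick)]
    ranks = []
    for p in targets:
        v = min(range(1, n_streams + 1), key=lambda r: abs(r / n_streams - p))
        ranks.append(v)
    return ranks


def pick_streams(rates, n_pick=16, lo=0.05, hi=0.95):
    """Indices of the streams whose bit-rate ranks are selected by pick_ranks."""
    n = len(rates)
    order = sorted(range(n), key=lambda i: rates[i])   # order[v - 1] has rank v
    return [order[v - 1] for v in pick_ranks(n, n_pick, lo, hi)]


if __name__ == "__main__":
    print(pick_ranks())
```

With 349 streams the first and last selected ranks are 17 and 332, the ranks whose orders are nearest 0.05 and 0.95.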

Figure 1 shows a number of overall effects of $\tau$, $\delta$
and $\omega$ on $u$. For each pair of values of $\omega$ and $\tau$, there
is an increase in $u$ with $\delta$, a strong main effect in the
data. In addition, there is an increase with $\tau$ for fixed
$\delta$ and $\omega$, another strong main effect. There is also a
main effect for $\omega$, but smaller in magnitude than for the
other two variables. The dependence of $u$ on $\log_2(\delta)$
is nonlinear, and changes substantially with the value
of $\tau$; as $\tau$ increases, the overall slope in $u$ as a function
of $\log_2(\delta)$ first increases and then decreases. In other
words, there is an interaction between $\log_2(\delta)$ and $\tau$.
Such an interaction complicates the dependence, so we
search further for a transformation $f$ of $u$ that removes
the interaction. This pattern occurs when all of the live
streams or all of the synthetic streams are plotted in the
same way.

There is an interaction between $\log_2(\delta)$ and $\tau$ in part
because when $u$ is close to 1, there is little room for
change as a function of $\log_2(\delta)$. For this reason, we
tried expanding the scale at 1 by taking the function
$f(u) = \log_2(1 - u)$. This did not achieve appreciably
greater simplicity because nonlinearity and an interaction
are still strongly present, but the cause of the interaction
is now behavior for smaller values of $\tau$.

FIG. 1. Utilization $u$ graphed against log delay $\log_2(\delta)$ given the delay probability $\omega$ and the stream $i$.

The nature of the remaining interaction for $f(u) =
\log_2(1 - u)$ suggests that a logit transformation might
do better:

    $f(u) = \mathrm{logit}_2(u) = \log_2\left(\frac{u}{1 - u}\right)$.

Figure 2 plots $\mathrm{logit}_2(u_{ijk})$ against $\log_2(\delta_j)$ using
the same streams and method as Figure 1. The logit
function greatly simplifies the dependence. The dependence
on $\log_2(\delta)$ is linear. There does not appear to be
any remaining interaction among the three variables:
$\log_2(\delta)$, $\tau$ and $\omega$. To help show this, 16 lines with
different intercepts but the same linear coefficient have
been drawn on the panels. The method of fitting is
described shortly. The lines provide an excellent fit.

7.3 Theory: The Classical Erlang Delay Formula 

The packet arrivals $a_v$ are not Poisson, although they
do tend toward Poisson as $c$ and $\tau$ increase. The packet
sizes, and therefore the service times, are not independent
exponential; they have a bounded discrete distribution
and are long-range dependent, although they tend
to independence as $c$ and $\tau$ increase. Still, we use,
as a suggestive case, the results for Poisson arrivals
and i.i.d. exponential service times to provide guidance
for our model building. Erlang showed that for such
a model the following equation holds (Cooper, 1972):

    $\omega = u\, e^{-(1 - u)\beta\delta}$.

Substituting $\beta = \tau/u$ and taking the negative log of
both sides we have

(1)    $-\log_2(\omega) = -\log_2(u) + \log_2(e)\,\frac{1 - u}{u}\,\delta\tau$.

Because $\omega$, which ranges from 0.001 to 0.05, is small
in the majority of our simulations compared with $u$, we
have, approximately,

    $-\log_2(\omega) = \log_2(e)\,\frac{1 - u}{u}\,\delta\tau$.

Taking logs of both sides and rearranging we have

(2)    $\mathrm{logit}_2(u) = \log_2(\log_2(e)) + \log_2(\tau) + \log_2(\delta) - \log_2(-\log_2(\omega))$.

So certain aspects of the simplicity of this classical
Erlang delay formula occur also in the pattern for our
much more statistically complex packet streams. In
both cases $\mathrm{logit}_2(u)$ is additive in functions of $\tau$, $\delta$
and $\omega$, and the dependence is linear in $\log_2(\delta)$.
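The quality of the approximation in (2) can be checked numerically. The sketch below is illustrative only: it takes the Erlang formula in the form $\omega = u\,e^{-(1-u)\beta\delta}$ with $\beta = \tau/u$, treats $\tau$ as a rate in units of mean service requirement per second (an assumption made for this check), and uses hypothetical function names and input values.

```python
import math


def logit2(u):
    return math.log2(u / (1.0 - u))


def erlang_delay_probability(u, tau, delta):
    """omega = u * exp(-(1 - u) * beta * delta) with beta = tau / u."""
    return u * math.exp(-(1.0 - u) * (tau / u) * delta)


def approx_logit2(tau, delta, omega):
    """Right-hand side of equation (2)."""
    return (math.log2(math.log2(math.e)) + math.log2(tau)
            + math.log2(delta) - math.log2(-math.log2(omega)))


if __name__ == "__main__":
    u, tau, delta = 0.8, 2000.0, 0.01       # hypothetical values
    omega = erlang_delay_probability(u, tau, delta)
    print("omega:", omega)
    print("exact logit2(u):", logit2(u))
    print("approximation (2):", approx_logit2(tau, delta, omega))
```

For these values the delay probability is about 0.005 and the approximation differs from the exact logit by less than 0.1, which is the size of error introduced by dropping the term $-\log_2(u)$ from (1).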

7.4 Conditional Dependence of $u$ on $\omega$

The approximate Erlang delay formula suggests that
we try the term $-\log_2(-\log_2(\omega))$, the negative
complementary log of $\omega$, in the model. In addition, as
we see in Section 9, certain asymptotic results suggest
this term as well. We studied the dependence of
$\mathrm{logit}_2(u)$ on $-\log_2(-\log_2(\omega))$ for all synthetic and
live streams using trellis display in the same way that
we studied the dependence on $\log_2(\delta)$. Figure 3 is a
trellis plot using the same 16 live streams as in Figure 2.
On each panel, $\mathrm{logit}_2(u_{ijk})$ is graphed against
$-\log_2(-\log_2(\omega_k))$ for one value of $\delta_j$ and one stream.

Figure 3 shows that the guidance from the Erlang
formula is on target: $\mathrm{logit}_2(u)$ is linear in
$-\log_2(-\log_2(\omega))$ and the slope remains constant
across streams and across different values of $\delta$. To
help show this, lines with the same linear coefficient
but different intercepts have been drawn on the panels.
The lines provide an excellent fit except for the
errant points for high utilizations observed earlier. The
method of fitting is described shortly. This pattern occurs
when all of the live streams or all of the synthetic
streams are plotted in the same way.

A stream-coefficient delay model. The empirical
findings in Figures 2 and 3 and the guidance from the
Erlang delay formula led to a very simple model that
fits the data,

(3)    $\mathrm{logit}_2(u_{ijk}) = \mu_i + o_\delta \log_2(\delta_j) + o_\omega\,(-\log_2(-\log_2(\omega_k))) + \varepsilon_{ijk}$,

where the $\varepsilon_{ijk}$ are realizations of an error random variable
with mean 0 and median absolute deviation $m(\varepsilon)$.
The $\mu_i$ are stream coefficients, which change with the
packet stream $i$ and characterize the statistical properties
of the stream.

We fitted the stream-coefficient delay model of (3)
twice: once to the 349 live streams and once to the
349 synthetic streams. In other words, we estimated the
coefficients $\mu_i$, $o_\delta$ and $o_\omega$ twice. Data exploration suggests
that the error distribution has longer tails than the
normal, so we used the bisquare method of robust estimation
(Mosteller and Tukey, 1977). The estimates of
$o_\delta$ and $o_\omega$ are

Live: $\hat{o}_\delta = 0.411$, $\hat{o}_\omega = 0.868$;
Synthetic: $\hat{o}_\delta = 0.436$, $\hat{o}_\omega = 0.907$.
FIG. 3. Logit utilization $\mathrm{logit}_2(u)$ graphed against the negative complementary log of the delay probability $-\log_2(-\log_2(\omega))$ given the
The two sets of estimates are very close in the sense
that the fitted equation is very close, that is, results in
very similar fitted QoS utilizations. For the 16 streams
shown in Figures 2 and 3 the lines are drawn using the
formula in (3) with the bisquare parameter estimates
$\hat{o}_\delta$, $\hat{o}_\omega$ and $\hat{\mu}_i$.

Because of the long-tailed error distribution, we use
the median absolute deviation $m(\varepsilon)$ as a measure of the
spread. The estimates from the residuals of the two fits
are

Live: $\hat{m}(\varepsilon) = 0.210$;
Synthetic: $\hat{m}(\varepsilon) = 0.187$.

The estimates are very small compared with the variation
in $\mathrm{logit}_2(u)$. In other words, the stream-coefficient
delay model provides a very close fit to the $\mathrm{logit}_2(u_{ijk})$.
Of course, this was evident from Figures 2 and 3 because
the fitted lines are quite close to the data.
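To illustrate the idea of bisquare estimation, the following sketch fits a single straight line by iteratively reweighted least squares. It is a simplification under stated assumptions, not the fitting code used for model (3), which has a separate intercept for each stream and two slopes. The tuning constant of 6 times the median absolute residual is a common choice that we assume here, and the function names are ours.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def bisquare_line_fit(x, y, c=6.0, iterations=20):
    """Robust line fit: points with large residuals are downweighted."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line_fit(x, y, w)
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        s = statistics.median([abs(r) for r in resid])   # robust scale
        if s == 0.0:
            break
        # Bisquare weights: smooth decrease to 0 at |residual| = c * s.
        w = [(1.0 - (r / (c * s)) ** 2) ** 2 if abs(r) < c * s else 0.0
             for r in resid]
    return a, b


if __name__ == "__main__":
    x = [float(i) for i in range(20)]
    y = [1.0 + 0.4 * xi + 0.05 * (-1) ** i for i, xi in enumerate(x)]
    y[10] += 50.0                      # one gross outlier
    print(bisquare_line_fit(x, y))
```

In the usage example the outlier receives weight 0 after the first iteration, so the fitted slope stays close to 0.4 where ordinary least squares would be pulled away from it.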

7.5 Strategy: Incorporating Dependence on
$\tau$ and $c$ for Practical Estimation

The coefficient $\mu_i$ in the stream-coefficient delay
model of (3) varies with the packet stream and reflects
how the changing statistical properties of the streams
affect the QoS utilization. Part of the simplicity of the
model is that a single number characterizes how the
statistical properties of a stream affect the QoS bandwidth.
However, the model cannot be used as a practical
matter for bandwidth estimation because it requires
a value of $\mu_i$, which would not typically be known.
If we knew the traffic characteristics in detail for the
link, for example, if we had FSD parameters, we could
generate traffic and run simulations to determine $\mu_i$ and
therefore the bandwidth. This might be possible in certain
cases, but in general is not feasible.

What we must do is start with (3) and find readily
available variables that measure stream statistical properties
and can replace $\mu_i$ in the stream-coefficient delay
model. We carry out this task in the remainder of this
section. Two variables replace $\mu_i$: the bit rate $\tau$ and the
mean number of active connections $c$, with their values
of $\tau_i$ and $c_i$ for each of our packet streams. We use
both theory and empirical study, as we did for the
stream-coefficient delay model, to carry out the model
building.

7.6 Theory: Fast-Forward Invariance, Rate Gains 
and Multiplexing Gains 

Figures 2 and 3 show that the QoS utilization $u$ increases
with $\tau$. There are two causes: rate gains and
multiplexing gains. Because $c$ is positively correlated
with $\tau$, $u$ increases with $c$ as well. However, $\tau$ and $c$
measure different aspects of the load, which is important
to the modeling. The bit rate $\tau$ is equal to $c\gamma_b$,
where $\gamma_b$, the connection bit rate in bits/s per connection,
measures the end-to-end speed of transfers,
and $c$ measures the amount of multiplexing. An increase
in either increases $\tau$.

First we introduce fast forwarding. Consider a generalized
packet stream with bit rate $\tau$ input to a queue
without any assumptions about the statistical properties.
The packet sizes can be any sequence of positive
random variables and the interarrivals can be any point
process. Suppose we are operating at the QoS utilization
$u = \tau/\beta$ for QoS delay criteria $\delta$ and $\omega$. Now for
$h > 1$ we speed up the traffic by dividing all interarrival
times $t_v$ by $h$. The packet stream has a rate
change: the statistical properties of the $t_v$ change only
by a multiplicative constant. A rate increase of $h$ increases
$\gamma_b$ by the factor $h$ but not $c$. The bit rate $\tau$
changes to $h\tau$. Suppose we also multiply the bandwidth
$\beta$ by $h$, so that the utilization $u$ is constant. Then
the delay process of the rate-changed packet stream is
the delay process for the original packet stream divided
by $h$. That is, if we carried out a simulation with a live
or synthetic packet stream and repeated the simulation
with the rate change, then the delay of each packet in
the second simulation would be the delay in the first
divided by $h$. The traffic bit rate, the bandwidth and the
delay process are speeded up by the factor $h$, but the
variation of the packet stream and the queueing otherwise
remain the same. If we changed our delay criterion
from $\delta$ to $\delta/h$, then the QoS utilization $u$ would
be the same, which means the QoS bandwidth is $h\beta$.
It is as if we videotaped the queueing mechanism in
the first simulation and then produced the second by
watching the tape on fast forward with the clock on
the tape player running faster by the factor $h$ as well.
We call this phenomenon fast-forward invariance.
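Fast-forward invariance can be verified directly on a simulated queue. The sketch below is a toy check under our own assumptions: the input is an arbitrary heavy-tailed stream chosen only to emphasize that no distributional assumptions are needed, the function name is hypothetical, and the queue is FIFO with an infinite buffer.

```python
import math
import random


def queueing_delays(interarrivals, sizes_bits, bandwidth):
    """Queueing delay of each packet at its arrival time (Lindley recursion)."""
    wait, prev_service, delays = 0.0, 0.0, []
    for t, q in zip(interarrivals, sizes_bits):
        wait = max(0.0, wait + prev_service - t)
        delays.append(wait)
        prev_service = q / bandwidth
    return delays


rng = random.Random(7)
n = 5000
t = [1e-4 * rng.paretovariate(1.5) for _ in range(n)]         # heavy-tailed gaps
q = [rng.choice([320.0, 4608.0, 12000.0]) for _ in range(n)]  # 40, 576, 1500 bytes
beta, h = 25e6, 4.0

slow = queueing_delays(t, q, beta)
fast = queueing_delays([x / h for x in t], q, h * beta)       # fast-forwarded run

# Every delay in the fast-forwarded run is the original delay divided by h.
same = all(math.isclose(f, s / h, rel_tol=1e-9, abs_tol=1e-15)
           for f, s in zip(fast, slow))
print(same)
```

The check should print True: the packet sizes are unchanged, while interarrival times and transmission times are both divided by $h$, so every term of the recursion scales by $1/h$.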

Let us now reduce some of the speedup of the fast
forwarding. We divide the $t_v$ by $h$, which increases $\gamma_b$
by the factor $h$, but we hold $\delta$ fixed and do not decrease
it by the factor $1/h$. What is the new QoS $u$ that satisfies
the delay criteria $\delta$ and $\omega$? Since $u$ satisfies the criteria
for delay $\delta/h$, we have room for more delay, so $u$ can
increase. In other words, a rate increase results in
utilization gains for the same $\delta$. This is the rate gain.

Now suppose we hold $\tau$ fixed but increase $c$ by the
factor $h > 1$. This means that $\gamma_b$ must be reduced by
the factor $1/h$. Now the statistical properties change in
other ways due to the increased multiplexing. As we
saw in Section 4, the $t_v$ tend toward Poisson and the
$q_v$ tend toward independence. The dissipation of the
long-range dependence of the packet streams, as well
as the tendency of the marginal distribution of $t_v$ toward
exponential, tends to decrease the queueing delay
distribution and thereby increase the QoS utilization.
In other words, there are statistical multiplexing gains.

Fig. 4. Dot plot of median log connection bit rate $\log_2(\gamma_b)$ for
six Internet links.

These theoretical considerations lead us to two important
conclusions about modeling. First, we want to
be sure that whatever model results, it must obey the
principle of fast-forward invariance. Second, it is unlikely
to be enough to model with just $\tau$. If $\gamma_b$ were
constant across the Internet, $\tau$ and $c$ would measure
exactly the same thing for our purposes and we would
have no need for $c$ beyond $\tau$, but if $\gamma_b$ changes
substantially, as seems likely, then we will need $c$ as well.
Figure 4 shows the six medians from the 349 values of
$\log_2(\gamma_b)$ for our live streams broken up into six groups
by the link. The range of the medians is about 4 log
base 2 bits/s per connection, which means that the medians
of $\gamma_b$ change by a factor of 16. One link, NZIX,
is appreciably slower than the others.

7.7 Modeling with $\tau$ and $c$

We begin by modeling just with $\tau$ to see if this can
explain the observed utilizations without $c$. The approximate
Erlang delay formula in (2) suggests that the
dependence of the stream coefficients on $\tau$ is linear in
$\log_2(\tau)$. This means the model for $\mathrm{logit}_2(u_{ijk})$ is

(4)    $\mathrm{logit}_2(u_{ijk}) = o + o_\tau \log_2(\tau_i) + o_\delta \log_2(\delta_j) + o_\omega\,(-\log_2(-\log_2(\omega_k))) + \psi_{ijk}$,

where the $\psi_{ijk}$ are realizations of an error random variable
with mean 0. In our initial explorations for the fit
and the residuals we discovered that the spread of the
residuals increased with increasing $\delta_j$. So we model
the median absolute deviation of the $\psi_{ijk}$ by $m_{\delta_j}(\psi)$,
allowing it to change with $\delta_j$.

FIG. 5. Dot plot of link median residuals from the first fit to the
logit utilization using only the bit rate $\tau$ to characterize stream
statistical properties.

We fitted the model to the live data and to the synthetic
data using the bisquare and also accommodating
the changing value of $m_{\delta_j}(\psi)$. Figure 5 shows dot plots
of the six medians of the residuals for the six links.
There is a clear link effect, mimicking the behavior
in Figure 4: The two links with the largest and smallest
residual medians are the two with the largest and
smallest median connection bit rates. The behavior of
these extremes is what we would expect. For example,
NZIX has the smallest median $\gamma_b$, so its bit rate
underpredicts the utilization because a stream at NZIX with
a certain $\tau$ has more multiplexing than
streams at other links with the same $\tau$, which means
the favorable statistical properties push the utilization
higher than expected under the model. The same plot
for the synthetic streams shows the same effect.

In addition, there is another inadequacy of this first
model. Because $\gamma_b$ is changing, we want the model
to obey the principle of fast-forward invariance, but
it does not because the estimates of $o_\tau$ and $o_\delta$ are
not equal.

We enlarge the bandwidth model by adding the variable
$\log_2(c)$. Because $\log_2(\tau)$ is used in the initial
model, adding $\log_2(c)$ is equivalent to adding $\log_2(\gamma_b)$. In
doing this we want an equation that obeys fast-forward
invariance: If we hold $c$ fixed, multiply $\tau$ by $h$ and
divide $\delta$ by $h$, then we do not want a change in the
QoS utilization. This is achieved by the best-effort delay
model

(5)    $\mathrm{logit}_2(u_{ijk}) = o + o_c \log_2(c_i) + o_{\tau\delta} \log_2(\tau_i \delta_j) + o_\omega\,(-\log_2(-\log_2(\omega_k))) + \psi_{ijk}$,

where the $\psi_{ijk}$ are error variables with mean 0 and median
absolute deviation $m_{\delta_j}(\psi)$. Fast-forward invariance
is achieved by entering $\tau$ and $\delta$ as a product.

We fitted the enlarged (5) to the live streams and
to the synthetic streams using the bisquare because
our exploration showed that the error distribution has
longer tails than normal. The estimation included a
normalization to adjust for the $m_{\delta_j}(\psi)$. The bisquare
estimates of $o$, $o_c$, $o_{\tau\delta}$ and $o_\omega$ are

Live: $\hat{o} = -8.933$, $\hat{o}_c = 0.420$, $\hat{o}_{\tau\delta} = 0.444$, $\hat{o}_\omega = 0.893$;
Synthetic: $\hat{o} = -8.227$, $\hat{o}_c = 0.353$, $\hat{o}_{\tau\delta} = 0.457$, $\hat{o}_\omega = 0.952$.



The two sets of estimates are very close in the sense
that the fitted equations are close. The estimates of
$m_{\delta_j}(\psi)$, the median absolute deviations of the
residuals, are

Delay:      1 ms,   5 ms,   10 ms,  50 ms,  100 ms;
Live:       0.211,  0.312,  0.372,  0.406,  0.484;
Synthetic:  0.169,  0.322,  0.356,  0.380,  0.457.

Again, the two sets of estimates are close.

It is important to consider whether the added variable
$c$ contributes in a significant way to the variability
in $\mathrm{logit}_2(u_{ijk})$ and does not depend fully on the single
link NZIX. We used the partial standardized residual
plot in Figure 6 to explore this. The standardized residuals
of regressing the logit utilization, $\mathrm{logit}_2(u)$, on the
predictor variables except $\log_2(c)$ are graphed against
the standardized residuals from regressing $\log_2(c)$ on
the same variables. The partial regressions are fitted using
the final bisquare weights from the full model fit,
and the standardization is a division of the residuals by
the estimates $\hat{m}_{\delta_j}(\psi)$. Figure 6 shows that $\log_2(c)$ has
explanatory power for each link separately and not just
across links. We can also see from the plot that there is
a remaining small link effect, but a minor one. This is
also demonstrated in Figure 7, which is the same plot
as Figure 5, but for the enlarged model. The horizontal
scales on the two plots have been made the same to
facilitate comparison. The major link effect is no longer
present in the enlarged model. The result is the same
for the same visual display for the synthetic data.



7.8 Alternative Forms of the Best-Effort 
Bandwidth Formula 

The best-effort delay formula of the best-effort delay
model in (5) is

(6)    $\mathrm{logit}_2(u_{ijk}) = o + o_c \log_2(c_i) + o_{\tau\delta} \log_2(\tau_i \delta_j) + o_\omega\,(-\log_2(-\log_2(\omega_k)))$.

Since $\tau = c\gamma_b$, the formula can be rewritten

(7)    $\mathrm{logit}_2(u) = o + (o_c + o_{\tau\delta}) \log_2(c) + o_{\tau\delta} \log_2(\gamma_b \delta) + o_\omega\,(-\log_2(-\log_2(\omega)))$.

In this form we see the action of the amount of multiplexing
of connections as measured by $c$ and the
end-to-end connection speed as measured by $\gamma_b$. An
increase in either results in an increase in the utilization
of a link.
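The equivalence of forms (6) and (7), and the fast-forward invariance of the formula, can be confirmed numerically. The sketch below uses the live-data coefficient estimates reported above; the function names and the input values are hypothetical.

```python
import math

# Live-data bisquare estimates reported in the text.
O, O_C, O_TD, O_W = -8.933, 0.420, 0.444, 0.893


def neg_comp_log(omega):
    """Negative complementary log of the delay probability."""
    return -math.log2(-math.log2(omega))


def logit2_form6(c, tau, delta, omega):
    """Equation (6): connections c, bit rate tau (bits/s), delay delta (s)."""
    return (O + O_C * math.log2(c) + O_TD * math.log2(tau * delta)
            + O_W * neg_comp_log(omega))


def logit2_form7(c, gamma_b, delta, omega):
    """Equation (7): the same formula written with gamma_b = tau / c."""
    return (O + (O_C + O_TD) * math.log2(c) + O_TD * math.log2(gamma_b * delta)
            + O_W * neg_comp_log(omega))


if __name__ == "__main__":
    c, gamma_b, delta, omega = 1000.0, 2.0 ** 14, 0.01, 0.01   # hypothetical inputs
    tau = c * gamma_b
    print(logit2_form6(c, tau, delta, omega), logit2_form7(c, gamma_b, delta, omega))
    # Fast forwarding: multiply tau by h and divide delta by h.
    print(logit2_form6(c, 4 * tau, delta / 4, omega))
```

All three printed values agree up to rounding error, since $\tau\delta = c\,\gamma_b\,\delta$ and the product $\tau\delta$ is unchanged by fast forwarding.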

7.9 Modeling the Error Distribution 

As we have discussed, our study of the residuals
from the fit of the best-effort delay model showed
that the scale of the residual error distribution increases
with the delay. The study also showed that
$\log_2(m_\delta(\psi))$ is linearly related to $\log_2(\delta)$. From the
least squares estimates for the live data, the estimate of
the intercept of the regression line is -0.481, the estimate
of the linear coefficient of the line is 0.166 and
the estimate of the standard error is 0.189. (Results are
similar for the synthetic data.)

We also found that when we normalized the residuals
by the estimates $\hat{m}_{\delta_j}(\psi)$, the resulting distribution of
values is very well approximated by a constant times a
$t$ distribution with 15 degrees of freedom. Because the
normalized residuals have a median absolute deviation
of 1 and $t_{15}$ has a median absolute deviation of 0.691,
the constant is $0.691^{-1}$. We use this modeling of the
error distribution for the bandwidth prediction in
Section 8.

8. BANDWIDTH ESTIMATION 

The best-effort delay model in (5) can be used to es- 
timate the bandwidth required to meet QoS criteria on 
delay for best-effort Internet traffic. We describe here a 
conservative procedure in the sense that the estimated 
bandwidth is unlikely to be too small. In doing this we 
use the coefficient estimates from the live delay data. 
Fig. 6. A partial residual plot for the explanatory variable $\log_2(c)$ for the best-effort delay model given each of the six Internet links.

Fig. 7. Dot plot of link median residuals for the best-effort delay
model.



First, we estimate the expected logit utilization
$\ell = \mathrm{logit}_2(u)$ by

    $\hat{\ell} = -8.933 + 0.420 \log_2(c) + 0.444 \log_2(\tau\delta) + 0.893\,(-\log_2(-\log_2(\omega)))$.

On the utilization scale this is

    $\hat{u} = \frac{2^{\hat{\ell}}}{1 + 2^{\hat{\ell}}}$.

Next we compute a predicted median absolute deviation
from the linear regression of Section 7.9:

    $\hat{m}_\delta(\psi) = 2^{-0.481 + 0.166 \log_2(\delta)}$.

Let $t_{15}(p)$ be the upper quantile of probability $p$ (that is,
the quantile of order $1 - p$) of a $t$ distribution
with 15 degrees of freedom. Then the lower
limit of a $100(1 - p)\%$ tolerance interval for $\ell$ is

    $\hat{\ell}(p) = \hat{\ell} - \hat{m}_\delta(\psi)\, t_{15}(p) / 0.691$.

For $p = 0.05$, $t_{15}(p) = 1.75$, so the lower 95% limit is

    $\hat{\ell}(0.05) = \hat{\ell} - 2.53\, \hat{m}_\delta(\psi)$.

The lower 95% limit on the utilization scale is

    $\hat{u}(0.05) = \frac{2^{\hat{\ell}(0.05)}}{1 + 2^{\hat{\ell}(0.05)}}$.


This process is illustrated in Figure 8. For the figure,
$\gamma_b$ was taken to be $2^{14}$ bits/s per connection. On each
panel the values of $\tau$ and $\omega$ are fixed to the values
shown in the strip labels at the tops of the panels, and
$\hat{u}$ and $\hat{u}(0.05)$ are both graphed against $\log_2(\delta)$ for
$\delta$ varying from 0.001 to 0.1 s.
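The estimation procedure of this section can be collected into a few lines of code. The sketch below is an illustration, not a reference implementation: the function name and the example inputs are ours, the coefficients are the live-data estimates quoted above, and the conversion from utilization to bandwidth uses $\beta = \tau/u$, which follows from the definition $u = \tau/\beta$. The value 1.75 for the $t_{15}$ quantile is the one given in the text for $p = 0.05$.

```python
import math


def bandwidth_estimate(tau, c, delta, omega, t15_quantile=1.75):
    """Predicted and conservative QoS utilization and bandwidth.

    tau: traffic bit rate (bits/s); c: mean number of active connections;
    delta: delay (s); omega: delay probability.
    t15_quantile: t_15(p); the default 1.75 corresponds to p = 0.05.
    """
    ell = (-8.933 + 0.420 * math.log2(c) + 0.444 * math.log2(tau * delta)
           + 0.893 * (-math.log2(-math.log2(omega))))
    mad = 2.0 ** (-0.481 + 0.166 * math.log2(delta))   # predicted m_delta(psi)
    ell_low = ell - mad * t15_quantile / 0.691         # lower tolerance limit
    u_hat = 2.0 ** ell / (1.0 + 2.0 ** ell)
    u_low = 2.0 ** ell_low / (1.0 + 2.0 ** ell_low)
    return {
        "u_hat": u_hat,
        "u_low": u_low,
        "bandwidth_hat": tau / u_hat,      # beta = tau / u
        "bandwidth_low_u": tau / u_low,    # conservative (larger) bandwidth
    }


if __name__ == "__main__":
    # Hypothetical link: 1000 connections at 2**14 bits/s each,
    # delay criterion 10 ms with delay probability 0.01.
    result = bandwidth_estimate(tau=1000 * 2.0 ** 14, c=1000.0,
                                delta=0.01, omega=0.01)
    for name, value in result.items():
        print(name, value)
```

For this hypothetical link the predicted utilization is about 0.59 and the lower 95% limit is about 0.44, so the conservative bandwidth is roughly a third larger than the point estimate.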



9. OTHER WORK ON BANDWIDTH ESTIMATION 
AND COMPARISON WITH THE RESULTS HERE 

Bandwidth estimation has received much attention 
in the literature. The work focuses on queueing be- 
cause the issue driving estimation is queueing. Some 
work is fundamentally empirical in nature in that it uses 
live streams as inputs to queueing simulations or syn- 
thetic streams from models that have been built with 
live streams, although theory can be invoked as well. 
Other work is fundamentally theoretical in nature in 
that the goal is to derive properties of queues math- 
ematically, although live data are sometimes used to 
provide values of parameters so that numerical results 
can be calculated. Most of this work uses derivations 
of the delay exceedance probability as a function of 
an input source to derive the required bandwidth for 
a given QoS requirement. The delay exceedance probability
is equivalent to our delay probability, where the
buffer size is related to the delay through multiplication
by the link bandwidth. Since exact calculations
of the delay probability are only feasible in special
cases, these methods seek an approximate analysis, for
example, using asymptotic methods, stochastic bounds
or, in some cases, simulations. There has been far
more theoretical than empirical work.

The statistical properties of the traffic stream, which 
have an immense impact on the queueing, receive at- 
tention to varying degrees. Investigators who carry out 
empirical studies with live streams do so as a guaran- 
tee of recreating the properties. Those who carry out 
studies with synthetic traffic from models must argue 
for the validity of the models. Much of the theoretical 
work takes the form of assuming certain stream proper- 
ties and then deriving the consequences, so the problem 
is solved for any traffic that might have these proper- 
ties. Sometimes, though, the problem is minimized by 
deriving asymptotic results under general conditions. 

9.1 Empirical Study 

Our study here falls in the empirical category, but 
with substantial guidance from theory. To estimate ex- 
ceedance probabilities, we run simulations of an in- 
finite buffer, FIFO queue with fixed utilization using 
live packet streams or synthetic streams from the FSD 
model as the input source. 
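The kind of simulation described here can be sketched with the Lindley recursion for the waiting time in a single-server FIFO queue with unlimited buffer. This is a minimal illustration, not the code used in the study; the function names and the three-packet toy stream are assumptions, and the delay of a packet is taken to be its waiting time before transmission begins.

```python
def queueing_delays(arrival_times, packet_bits, bandwidth):
    """Waiting time in queue (seconds) of each packet in an infinite-buffer
    FIFO queue drained at `bandwidth` bits/s (Lindley recursion)."""
    delays = []
    wait = 0.0
    for k in range(len(arrival_times)):
        if k > 0:
            service_prev = packet_bits[k - 1] / bandwidth
            gap = arrival_times[k] - arrival_times[k - 1]
            # Work left from the previous packet, minus the time elapsed.
            wait = max(0.0, wait + service_prev - gap)
        delays.append(wait)
    return delays


def delay_probability(arrival_times, packet_bits, bandwidth, delta):
    """Fraction of packets whose queueing delay exceeds delta seconds."""
    delays = queueing_delays(arrival_times, packet_bits, bandwidth)
    return sum(d > delta for d in delays) / len(delays)


# Toy stream (assumed): three 1500-bit packets, one per second, 1000 bits/s link.
arrivals = [0.0, 1.0, 2.0]
sizes = [1500, 1500, 1500]
waits = queueing_delays(arrivals, sizes, bandwidth=1000.0)   # -> [0.0, 0.5, 1.0]
frac = delay_probability(arrivals, sizes, bandwidth=1000.0, delta=0.4)
```

Fixing the utilization, as in the text, corresponds to setting `bandwidth` to the stream bit rate divided by the chosen utilization.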

The tradition of using live Internet streams in a
queueing simulation began early in the study of Internet
traffic. In a very important study it was shown that
long-range dependent traffic results in much larger
queue lengths than Poisson traffic (Erramilli, Narayan
and Willinger, 1996). This was an important result
because it showed that the long-range
dependence of the traffic would have a large impact
on Internet engineering. In other studies, queueing
simulations of both live and altered live traffic are
used to study the effect of dependence properties and
multiplexing on performance using the average queue
length as the performance metric (Erramilli, Narayan,
Neidhardt and Saniee, 2000; Cao, Cleveland, Lin and
Sun, 2001).

Fig. 8. The QoS utilization from the best-effort delay model graphed against log delay given the bit rate and the delay probability. The connection bit rate $\gamma_b$ is taken to be $2^{14}$ bits/s per connection. The upper curve on each panel estimates the expected values and the lower curve gives the minimum values of 95% tolerance intervals.

In Mandjes and Boots (2004), queueing simulations 
of multiplexed on-off sources are used to study the in- 
fluence of on and off time distributions on the shape of 
the loss curve and on performance. To improve the ac- 
curacy of the delay probabilities based on simulations, 
techniques such as importance sampling are also con- 
sidered (Boots and Mandjes, 2002). 

One study (Fraleigh, Tobagi and Diot, 2003) first 
used live streams (Internet backbone traffic) to validate 
a traffic model: an extension of fractional Brownian 
motion (FBM) known as a two-scale FBM process. 
They did not model the packet process, but rather 
modeled bit rates as a continuous function. Then they 
derived approximations of the delay exceedance prob- 
ability from the model, which served as the basis for 
their bandwidth estimation. Parameters in the two- 
scale FBM model that appear in the formula are related 
to the bit rate $\tau$ using the data. This is similar to our
process here where we relate the stream coefficients to
$c$ and $\tau$. Fraleigh, Tobagi and Diot also used queueing
simulations to determine delay exceedance probabili- 
ties as a method of validation. We compare their results 
and ours at the end of this section. 

9.2 Mathematical Theory: Effective Bandwidth 

A very large number of publications have been writ- 
ten in an area of bandwidth estimation that is referred 
to as effective bandwidth. The effective bandwidth of 
an input source provides a measure of its resource us- 
age for a given QoS requirement, which should lie 
somewhere between the mean rate and the peak rate. 
Let A(t) be the total workload (e.g., bytes) generated 
by a source in the interval [0, t]. The mathematical de- 
finition of the effective bandwidth of the source (Kelly, 
1996) is 


$$\alpha(s, t) = \frac{1}{st}\log E\!\left[e^{sA(t)}\right], \qquad 0 < s, t < \infty, \tag{8}$$

for some space parameter $s$ and time parameter $t$. For
the purpose of bandwidth estimation, the appropriate
choice of parameters depends on the traffic characteristics
of the source and the QoS requirements, as well
as the properties of the traffic with which the source
is multiplexed. Subsequently we discuss the effective
bandwidth approach to bandwidth estimation based on
approximating the delay probability in the asymptotic
regime of many sources.
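As a concrete instance of definition (8), consider a Gaussian source, a standard textbook case that is not specific to this article. The notation below ($\mu$ for the mean rate and $\sigma^2(t)$ for the variance of $A(t)$) is introduced here only for illustration. If $A(t)$ is normal with mean $\mu t$ and variance $\sigma^2(t)$, the normal moment generating function gives:

```latex
\alpha(s,t)
  = \frac{1}{st}\Bigl(s\mu t + \tfrac{1}{2}s^{2}\sigma^{2}(t)\Bigr)
  = \mu + \frac{s\,\sigma^{2}(t)}{2t}.
% Fractional Brownian motion input: \sigma^2(t) = \sigma^2 t^{2H}, so
\alpha(s,t) = \mu + \tfrac{1}{2}\,s\,\sigma^{2}\,t^{2H-1}.
```

In this case the effective bandwidth is the mean rate plus a term that grows with the space parameter $s$, and it tends to the mean rate $\mu$ as $s \to 0$.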

Consider the delay exceedance probability for a
FIFO queue on a link with constant bit rate. In the
asymptotic regime of many sources, we are concerned
with how the delay probability decays as
the size of the system increases. Suppose there are
$n$ sources and the traffic generated by the $n$ sources
is identical, independent and stationary. The number
of sources $n$ grows large at the same time that resources
such as the link bandwidth $\beta$ and buffer sizes
scale proportionally, so the delay $\delta$ stays constant. Let
$\beta = n\beta_0$ for some $\beta_0$ and let $Q_n$ be the queueing delay.
Under very general conditions, it can be shown
that (Botvich and Duffield, 1995; Courcoubetis and
Weber, 1996; Simonian and Guibert, 1995; Likhanov
and Mazumdar, 1998; Mandjes and Kim, 2001)

$$\lim_{n\to\infty} -n^{-1}\log P(Q_n > \delta) = I(\delta, \beta_0), \tag{9}$$

where

$$I(\delta, \beta_0) = \inf_{t>0}\,\sup_{s}\,\bigl(s\beta_0(\delta + t) - st\,\alpha(s, t)\bigr), \tag{10}$$

and is sometimes referred to as the loss curve in the
literature. Let $(s^*, t^*)$ be an extremizing pair in (10).
Then $\alpha(s^*, t^*)$ is the effective bandwidth for the single
source as defined in (8) and $n\alpha(s^*, t^*)$ is the
effective bandwidth for the $n$ sources. For a QoS
requirement of a delay $\delta$ and a delay probability $\omega$,
approximating the delay probability $P(Q_n > \delta)$ using
$\exp(-nI(\delta, \beta_0))$ [equation (9)], the bandwidth required
for the $n$ sources can be found by solving the
following equation for $\beta$:

$$s^*\beta(\delta + t^*) - s^*t^*\,n\alpha(s^*, t^*) = -\log\omega. \tag{11}$$

This gives

$$\beta = \frac{s^*t^*\,n\alpha(s^*, t^*) - \log\omega}{s^*(\delta + t^*)}, \tag{12}$$

which is the effective bandwidth solution to the bandwidth
estimation problem. If the delay $\delta \to \infty$, then
the extremizing value of $t^*$ approaches $\infty$ and the
bandwidth in (12) reduces to

$$n \lim_{t^*\to\infty} \alpha(s^*, t^*),$$

and we recover the classical effective bandwidth definition
of a single source $\lim_{t^*\to\infty} \alpha(s^*, t^*)$ for the
large buffer asymptotic model (Elwalid and Mitra,
1993; Guerin, Ahmadi and Naghshineh, 1991; Kesidis,
Walrand and Chang, 1993; Chang and Thomas, 1995).
If the delay $\delta \to 0$, then for the extremizing pair $t^* \to 0$
and $s^*t^* \to s$ for some $s$, and the bandwidth in (12) reduces to

$$n \lim_{t^*\to 0} \alpha\!\left(\frac{s}{t^*}, t^*\right) - \frac{\log\omega}{s},$$

and we recover the effective bandwidth definition
$\lim_{t^*\to 0} \alpha(s/t^*, t^*)$ for the bufferless model (Hui,
1988).

As we can see, the effective bandwidth solution
requires evaluation of the loss curve $I(\delta, \beta_0)$ [equation (10)].
However, an explicit form of the loss curve
is generally not available. One approach is to derive
approximations of the loss curve under buffer asymptotic
models, that is, the large buffer asymptotic model
($\delta \to \infty$) or the bufferless model ($\delta \to 0$), for some
classes of input source arrivals. For example, if the
source arrival process is Markovian, then for some
$\eta > 0$ and $\nu$ (Botvich and Duffield, 1995),

$$\lim_{\delta\to\infty} I(\delta, \beta_0) - \eta\delta = \nu.$$

If the source arrival is fractional Brownian motion with
Hurst parameter $H$, then for some $\nu > 0$ (Duffield,
1996)

$$\lim_{\delta\to\infty} I(\delta, \beta_0)/\delta^{2-2H} = \nu.$$

For an on-off fluid arrival process, it is shown that as
$\delta \to 0$ for some constants $\eta(\beta_0)$ and $\nu(\beta_0)$ (Mandjes
and Kim, 2001),

$$I(\delta, \beta_0) \sim \eta(\beta_0) + \nu(\beta_0)\sqrt{\delta} + O(\delta),$$

and as $\delta \to \infty$ for some constant $\theta(\beta_0)$ (Mandjes and
Boots, 2002),

$$I(\delta, \beta_0) \sim \theta(\beta_0)\,v(\delta),$$

where $v(\delta) = -\log P(\text{residual on period} > \delta)$. However,
it is found that bandwidth estimation based on
buffer asymptotic models suffers practical problems. 
For the large buffer asymptotic model, the estimated 
bandwidth could be overly conservative or optimistic 
because it does not take into account the statistical mul- 
tiplexing gain (Choudhury, Lucantoni and Whitt, 1994; 
Knightly and Shroff, 1999). For the bufferless model, 
there is a significant utilization penalty in the estimated 
bandwidth (Knightly and Shroff, 1999) since results in- 
dicate that there is a significant gain even with a small 
buffer (Mandjes and Kim, 2001). 

Another approach proposed by Courcoubetis, Siris
and Stamoulis (1999) and Courcoubetis and Siris
(2001) is to numerically evaluate the loss curve
$I(\delta, \beta_0)$. First, these authors evaluated the effective
bandwidth function $\alpha(s, t)$ [equation (8)] empirically
based on measurements of traffic byte counts in fixed
size intervals. Then they obtained the loss curve [equation (10)]
using numeric optimizing procedures with
respect to the space parameter $s$ and the time parameter $t$.
As examples, they applied this approach to
estimate bandwidth where the input source is a Bellcore
Ethernet WAN stream or streams of incoming IP
traffic over the University of Crete's wide area link.
Their empirical approach is model-free in the sense
that it does not require a traffic model for the input
source and all evaluations are based on traffic measurements.
However, their approach is computationally
intensive, not only because the effective bandwidth
function $\alpha(s, t)$ has to be evaluated for all time parameters $t$,
but also because the minimization with respect
to $t$ is nonconvex (unlike the maximization in the space
parameter $s$) and thus difficult to perform numerically
(Gibbens and Teh, 1999; Kontovasilis, Wittevrongel,
Bruneel, Van Houdt and Blondia, 2002).
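To make the numerical approach concrete, the following Python sketch estimates $\alpha(s, t)$ of equation (8) from workload counts in fixed-size intervals and approximates the loss curve of equation (10) by a plain grid search. It is an illustration under stated assumptions, not the procedure of the cited authors: the function names, the use of non-overlapping windows, and the grid search over $s$ and $t$ are choices made here for simplicity.

```python
import math


def block_sums(counts, k):
    """Workload in consecutive non-overlapping windows of k intervals."""
    return [sum(counts[i:i + k]) for i in range(0, len(counts) - k + 1, k)]


def log_mean_exp(values):
    """log of the sample mean of exp(v), computed stably."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def effective_bandwidth(counts, dt, s, k):
    """Empirical alpha(s, t) of equation (8) with t = k * dt."""
    t = k * dt
    return log_mean_exp([s * a for a in block_sums(counts, k)]) / (s * t)


def loss_curve(counts, dt, delta, beta0, s_grid, k_grid):
    """Grid approximation to I(delta, beta0) in equation (10):
    minimum over t = k * dt of the maximum over s."""
    best = math.inf
    for k in k_grid:
        t = k * dt
        sup_s = max(
            s * beta0 * (delta + t) - s * t * effective_bandwidth(counts, dt, s, k)
            for s in s_grid
        )
        best = min(best, sup_s)
    return best


# Toy inputs (assumed): a constant source and a bursty source, same mean rate.
constant = [5.0] * 8
bursty = [0.0, 10.0] * 4
alpha_constant = effective_bandwidth(constant, dt=1.0, s=0.3, k=2)
alpha_bursty = effective_bandwidth(bursty, dt=1.0, s=0.1, k=1)
```

For the constant source the estimate equals the rate itself, while for the bursty source it lies between the mean rate and the peak rate, as the definition requires. The nested loops also show why the cost grows quickly with the sizes of the two grids.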

In the effective bandwidth approach, one typically
approximates the buffer exceedance probability based
on its logarithmic asymptote. For example, in the asymptotic
regime of many sources, using (9), one can
approximate

$$P(Q_n > \delta) \approx \exp(-nI(\delta, \beta_0)). \tag{13}$$

An improved approximation can be found by incorporating
a prefactor, that is,

$$P(Q_n > \delta) \approx K(n, \delta, \beta_0)\exp(-nI(\delta, \beta_0)).$$

Using the Bahadur-Rao theorem, such an approximation
has been obtained for the delay exceedance probability
in the infinite buffer case as well as for the cell loss
ratio in the finite buffer case, which has the same logarithmic
asymptote but a different prefactor (Likhanov and
Mazumdar, 1998).

9.3 Theory: Other Service Disciplines 

Some authors have investigated service disciplines
other than FIFO, such as generalized processor sharing
(Zhang, Towsley and Kurose, 1994) and priority
queueing (Berger and Whitt, 1998). Although TCP is
the dominant protocol in today's Internet, we do
not consider the effect of the TCP feedback control
mechanism since the link whose bandwidth we seek
to estimate is not a bottleneck link. To account for
the TCP feedback control, other authors have studied
characteristics of bandwidth sharing for elastic traffic
and investigated the bandwidth estimation problem
for such traffic (de Veciana, Konstantopoulos
and Lee, 2001; Ben Fred, Bonald, Proutiere, Regnie
and Roberts, 2001). Still other authors have considered
regulated input traffic such as that from a
leaky bucket (Elwalid, Mitra and Wentworth, 1995;
Lo Presti, Zhang, Kurose and Towsley, 1999; Kesidis
and Konstantopoulos, 2000; Chang, Chiu and Song,
2001).

9.4 Theory: Direct Approximations of the 
Delay Probability 

Besides approximating the delay exceedance proba- 
bility using the effective bandwidth approach, some au- 
thors have considered direct approximations for some 
special classes of input traffic models. For example, 
for a Markov modulated fluid source, the delay prob- 
ability can be more accurately expressed as a single 
exponential with a prefactor K determined from the 
loss probability in a bufferless multiplexer as estimated 
by Chernoff's theorem (Elwalid, Heyman, Lakshman,
Mitra and Weiss, 1995). For an aggregate Markov 
modulated fluid source, the delay probabilities can be 
approximated by a sum of exponentials (Shroff and 
Schwartz, 1998). For a Gaussian process, a tight lower 
bound of the delay probability can be obtained using 
maximum-variance based approaches (Norros, 1994; 
Knightly, 1997; Choe and Shroff, 1998; Fraleigh, 
Tobagi and Diot, 2003). These expressions can be used 
in place of (13) to derive the required bandwidth for a 
QoS requirement. Readers are referred to Knightly and 
Shroff (1999) for a nice overview and comparison of 
these approaches as well as the aforementioned effec- 
tive bandwidth approach for bandwidth estimation. 
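To make the maximum-variance idea concrete, the following sketch works through the FBM case. The notation ($\mu$ for the mean rate, $\sigma^2 t^{2H}$ for the variance of the workload in an interval of length $t$, $b$ for the queue content in bits, $\kappa(H)$ for the resulting constant) is introduced here for illustration and is not taken from the cited papers. The first line is the standard approximation for a Gaussian input, and the rest follows by elementary calculus.

```latex
% Maximum-variance approximation for a Gaussian input on a link of rate \beta:
P(Q > b) \approx \exp\Bigl(-\inf_{t>0}
   \frac{\bigl(b + (\beta-\mu)t\bigr)^{2}}{2\sigma^{2}t^{2H}}\Bigr).
% Setting the derivative with respect to t to zero gives the dominant time scale
t^{*} = \frac{H}{1-H}\,\frac{b}{\beta-\mu},
% and substituting t^* back into the exponent,
P(Q > b) \approx \exp\Bigl(-\frac{(\beta-\mu)^{2H}\,b^{\,2-2H}}
   {2\sigma^{2}\,\kappa(H)^{2}}\Bigr),
\qquad \kappa(H) = H^{H}(1-H)^{1-H}.
```

With $b = \beta\delta$ the exponent is proportional to $\delta^{2-2H}$, that is, a Weibull tail with shape parameter $2 - 2H$.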

9.5 Theory: Queueing Distributions 

We now discuss implications of our stream-coefficient
delay formula and best-effort delay formula, and
their relationship to some previous work. The stream-coefficient
delay model in (3) implies that for each
stream $i$,

$$\omega = P(Q_i > \delta) \approx \exp\!\left(-\log 2 \cdot 2^{\mu_i/o_\omega}\left(\frac{1-u}{u}\right)^{1/o_\omega}\delta^{\,o_\delta/o_\omega}\right)$$

for stream coefficient $\mu_i$ and regression coefficients
$o_\omega$, $o_\delta$. This suggests that the tail distribution of queueing
delay is Weibull with shape parameter $o_\delta o_\omega^{-1}$. The
Weibull form is consistent with the FBM traffic model
(and also the two-scale FBM model), but there the
shape parameter is $2 - 2H$. Notice that $o_\delta o_\omega^{-1}$ from
our analysis is 0.52 for the real data and 0.42 for the
synthetic data, which is quite different from the shape
parameter computed from $2 - 2H = 0.18$. If the bit rate
per connection $\gamma_b$ is a fixed constant, the best-effort delay
formula in (5) implies that for some constant $o'$,

$$\omega = P(Q_i > \delta) \approx \exp\!\left(-o'\,\tau_i^{(o_c + o_{\tau\delta})/o_\omega}\left(\frac{1-u}{u}\right)^{1/o_\omega}\delta^{\,o_{\tau\delta}/o_\omega}\right).$$

If $o_c + o_{\tau\delta} = o_\omega$ and the traffic bit rate $\tau_i$ is a multiple
of $\tau$ (i.e., $\tau_i = n_i\tau$), then the above approximation
is consistent with the effective bandwidth result for the
many-sources asymptotics [equation (9)]. In our empirical
analysis we found $o_{\tau\delta} + o_c$ and $o_\omega$ to be quite
close; the ratio $(o_{\tau\delta} + o_c)o_\omega^{-1}$ is 0.97 for real data and
0.85 for synthetic data. One of the reasons that this ratio
is not 1 is possibly because (9) is an asymptotic
formula.
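The tail form discussed in this section can be checked numerically by inverting the best-effort delay formula for $\omega$. The Python sketch below does this with the coefficient estimates quoted in Section 10.3 and the error term ignored. The function names and the example inputs (100 megabits/s at $2^{14}$ bits/s per connection) are assumptions made for illustration.

```python
import math

# Coefficient estimates quoted in Section 10.3 of this article.
O, O_C, O_TAU_DELTA, O_OMEGA = -8.933, 0.420, 0.444, 0.893


def utilization(c, tau, delta, omega):
    """QoS utilization from the best-effort delay formula (no error term)."""
    ell = (O + O_C * math.log2(c) + O_TAU_DELTA * math.log2(tau * delta)
           + O_OMEGA * (-math.log2(-math.log2(omega))))
    return 2.0 ** ell / (1.0 + 2.0 ** ell)


def delay_probability(u, c, tau, delta):
    """Delay probability implied by the same formula at utilization u."""
    logit = math.log2(u / (1.0 - u))
    z = (logit - O - O_C * math.log2(c)
         - O_TAU_DELTA * math.log2(tau * delta)) / O_OMEGA
    # z = -log2(-log2(omega))  =>  omega = 2 ** (-(2 ** (-z)))
    return 2.0 ** (-(2.0 ** (-z)))


# Illustrative inputs (assumed): 100 megabits/s at 2**14 bits/s per connection.
tau, c = 1e8, 1e8 / 2 ** 14
u = utilization(c, tau, delta=0.010, omega=0.01)
tail = [delay_probability(u, c, tau, d) for d in (0.005, 0.010, 0.020, 0.040)]

# Slope of log2(-log2(omega)) against log2(delta): the Weibull shape parameter.
shape = math.log2(-math.log2(tail[2])) - math.log2(-math.log2(tail[1]))
```

At fixed utilization the implied delay probability falls as the delay grows, and the slope computed in the last line equals the ratio of the coefficient of $\log_2(\tau\delta)$ to the coefficient of the $\omega$ term.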

9.6 Comparison of the Results Presented Here 
with Other Work 

The work presented in this article resulted in a sim- 
ple formula for bandwidth estimation. At the same 
time, validation has been extensive, permeating all ar- 
eas of the work. Validation is carried out in two ways: 
empirically and theoretically. 

The large number of papers in the area of effec- 
tive bandwidth and other theoretical work cited above 
have yielded much insight. This work has posited traf- 
fic stream models and investigated the resulting math- 
ematical properties. However, for best-effort Internet 
traffic there has been no extensive study to deter- 
mine whether some posited model accurately describes 
the stream statistical properties nor has there been 
extensive work in the form of empirical queueing sim- 
ulations to determine whether queueing results for 
best-effort traffic fit the theory. Consequently, the sim- 
ple best-effort delay formula, which is not readily 
derivable without a hint of the final results, was not 
discovered. 

The interesting paper cited above that used the two- 
scale FBM model surely took great pains to validate the 
model (Fraleigh, Tobagi and Diot, 2003). One prob- 
lem with this approach — modeling traffic bit flow as 
a fluid rather than the packet process as it appears 
on the link — is that the Gaussian assumption does not 
take hold until the level of aggregation is quite high. 
Consequently, the FBM model is not a good approximation
until the traffic rate is 50 megabits/s and above,
so their bandwidth estimation model is not validated 
below 50 megabits/s. By contrast, our best-effort de- 
lay model is valid to as low as 1 megabit/s. However, 
the ensuing methods used by Fraleigh, Tobagi and Diot 
(2003) to find the QoS bandwidth require a series of ap- 
proximations and a worst-case empirical method in the 
estimation of parameters. There appears to have been 
little checking of these approximations. The bandwidth 
results appear to us to be inaccurate, possibly aris- 
ing from some of the approximations. First, as the bit 
rate increases up to 1 gigabit/s, the utilization appears 
to stabilize at values less than 1 and substantially so 
in some cases. As our theoretical discussion of rate 
gains and multiplexing gains demonstrates, the utiliza- 
tion must increase to 1 as the bit rate increases. This 
is the case for our best-effort delay model. For exam- 
ple, the utilization for a delay of 10 ms, a probabil- 
ity of 0.01 and a bit rate of 1 gigabit/s is 90% from 
Figure 8 of Fraleigh, Tobagi and Diot (2003), but is 
98% for our model. In addition, the model in Fraleigh, 
Tobagi and Diot (2003) works simply with the bit rate 
rather than decomposing into the number of active con- 
nections times the bit rate per connection and using 
two variables, as is done in the best-effort delay model 
here. As we have demonstrated theoretically and em- 
pirically, the bit rate is not sufficient to account for 
the utilization since a fast network and a network with 
a high traffic connection load must be distinguished. 
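The last point can be quantified directly from the best-effort delay formula. As an illustration with assumed inputs, take two links that carry the same bit rate $\tau$ under the same $\delta$ and $\omega$, but with connection loads $c_1 = 10\,c_2$, so that the second link has ten times the bit rate per connection. Using the coefficient estimate 0.420 for $\log_2(c)$ and ignoring the error term:

```latex
\operatorname{logit}_2(u_1) - \operatorname{logit}_2(u_2)
  = o_c \log_2(c_1/c_2)
  = 0.420 \times \log_2(10) \approx 1.40,
\qquad
\frac{u_1/(1-u_1)}{u_2/(1-u_2)} = 2^{1.40} \approx 2.6 .
```

The odds $u/(1-u)$ of the two links therefore differ by a factor of about 2.6 even though their bit rates are identical.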

10. RESULTS AND DISCUSSION 

10.1 Problem Formulation 

Suppose the packet stream (packet arrival times and
sizes) arriving for transmission on an Internet link is
best-effort traffic with bit rate $\tau$ bits/s and number of
simultaneous active connections $c$. Suppose the link input
buffer is large enough that packet loss is negligible.
Our goal is to estimate the QoS bandwidth $\beta$ in bits/s
or, equivalently, the QoS utilization $u = \tau/\beta$, that satisfies
QoS criteria for the packet queueing delay in the
link input buffer. The criteria are a delay $\delta$ in seconds
and the probability $\omega$ that the delay for a packet exceeds
$\delta$.

10.2 Other Work on the Problem 

There is a large literature on the bandwidth estimation
problem. Much of it is theoretical, that is,
mathematical results that derive properties of queueing
systems. A smaller literature is empirical in nature,
based on simulations with packet stream inputs from
measurements on live links or from models for traffic.
The classical Erlang delay formula provides a simple
formula that can be used to estimate bandwidth for
traffic streams that in theory have Poisson arrivals and
i.i.d. exponential sizes. Best-effort traffic is much more
complex: It is nonlinear and long-range dependent and,
to date, has had no simple, validated formula to describe it.
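For comparison, the Poisson-exponential case can be sketched in a few lines. The Python sketch below uses the single-server (M/M/1) queue, whose waiting-time tail is the standard result $P(W > \delta) = \rho\,e^{-(\mu - \lambda)\delta}$ with $\rho = \lambda/\mu$. The parameterization by a mean packet size in bits and the function names are assumptions made here, since this article's own statement of the Erlang delay formula appears in an earlier section.

```python
import math


def mm1_delay_probability(tau, beta, delta, mean_packet_bits):
    """P(waiting time > delta) for an M/M/1 queue with offered bit rate tau
    and link bandwidth beta (both in bits/s)."""
    rho = tau / beta                                  # utilization
    return rho * math.exp(-(beta - tau) * delta / mean_packet_bits)


def mm1_qos_bandwidth(tau, delta, omega, mean_packet_bits, tol=1e-9):
    """Bandwidth at which the delay probability equals omega (bisection)."""
    lo, hi = tau, 2.0 * tau
    while mm1_delay_probability(tau, hi, delta, mean_packet_bits) > omega:
        hi *= 2.0                                     # bracket the solution
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if mm1_delay_probability(tau, mid, delta, mean_packet_bits) > omega:
            lo = mid
        else:
            hi = mid
    return hi


# Illustrative inputs (assumed): 1 megabit/s of traffic, 8000-bit mean packets.
beta = mm1_qos_bandwidth(tau=1e6, delta=0.010, omega=0.01, mean_packet_bits=8000.0)
u = 1e6 / beta
```

The bisection works because the delay probability decreases monotonically as the bandwidth increases above the offered bit rate.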

10.3 Principal Result: The Best-Effort Delay Model 

The principal result of this paper is a statistical 
model that provides a simple, validated formula for the 
estimation of bandwidth for best-effort traffic that per- 
forms in the same way that the Erlang delay formula 
does for the Poisson-exponential case. The model has 
been validated through extensive empirical study and 
through consistency with certain theoretical properties 
of queueing. 

The model consists of the best-effort delay formula
plus random variation,

$$\operatorname{logit}_2(u) = o + o_c\log_2(c) + o_{\tau\delta}\log_2(\tau\delta) + o_\omega(-\log_2(-\log_2(\omega))) + \psi,$$

where $\psi$ is a random error variable with mean 0 and
median absolute deviation $m_\delta(\psi)$, which depends on $\delta$;
$\log_2$ is the log base 2; and $\operatorname{logit}_2(u) = \log_2(u/(1 - u))$.
The distribution of $0.691\,\psi/m_\delta(\psi)$ is a $t$ distribution
with 15 degrees of freedom. Estimates of the coefficients
of the model are

$$\hat{o} = -8.933, \quad \hat{o}_c = 0.420, \quad \hat{o}_{\tau\delta} = 0.444, \quad \hat{o}_\omega = 0.893.$$

The expression $m_\delta(\psi)$ is modeled as a function of $\delta$:
$\log_2(m_\delta(\psi))$ is a linear function of $\log_2(\delta)$ plus random
variation. The estimate of the intercept of the line
is $-0.481$, the estimate of the linear coefficient of the
line is 0.166 and the estimate of the standard error
is 0.189. The bit rate $\tau$ is equal to $c\gamma_b$, where $\gamma_b$ is
the connection bit rate in bits/s per connection. So the
best-effort delay formula can also be written

$$\operatorname{logit}_2(u) = o + (o_c + o_{\tau\delta})\log_2(c) + o_{\tau\delta}\log_2(\gamma_b\delta) + o_\omega(-\log_2(-\log_2(\omega))).$$

In this form we see the action of the amount of multiplexing
of connections as measured by $c$ and we see the
end-to-end connection speed as measured by $\gamma_b$. An increase
in either results in an increase in the utilization
of a link.
The best-effort delay model is used to estimate the
bandwidth required to carry best-effort traffic given $\delta$,
$\omega$, $\tau$ and $c$. The QoS logit utilization is estimated by

$$\hat{\ell} = -8.933 + 0.420\log_2(c) + 0.444\log_2(\tau\delta) + 0.893(-\log_2(-\log_2(\omega))),$$

so the QoS utilization is estimated by

$$\hat{u} = \frac{2^{\hat{\ell}}}{1 + 2^{\hat{\ell}}}.$$

The corresponding estimated bandwidth is $\tau/\hat{u}$. For
such an estimate there is a 50% chance of being too
large and a 50% chance of being too small. We could,
however, use a more conservative estimate that provides
a much smaller chance of too little bandwidth.
Let

$$\hat{m}_\delta(\psi) = 2^{-0.481 + 0.166\log_2(\delta)}$$

be the estimate of $m_\delta(\psi)$. Let $t_{15}(p)$ be the upper $100p\%$
percentage point of a $t$ distribution with 15 degrees of
freedom, where $p$ is small, say 0.05. Let

$$\hat{\ell}(p) = \hat{\ell} - \hat{m}_\delta(\psi)\,t_{15}(p)/0.691.$$

Then

$$\hat{u}(p) = \frac{2^{\hat{\ell}(p)}}{1 + 2^{\hat{\ell}(p)}}$$

is a conservative utilization estimate, the lower limit of
a $100(1 - p)\%$ tolerance interval for the QoS utilization. The
corresponding estimated bandwidth is $\tau/\hat{u}(p)$.

10.4 Methods 

The best-effort delay model was built, in part, from 
queueing theory. Certain predictor variables were sug- 
gested by the Erlang delay formula. Theory prescribes 
certain behavior as $\tau$, $c$ or $\gamma_b$ increases, resulting in
rate gains, multiplexing gains or fast-forward invari- 
ance, and the model was constructed to reproduce the 
behavior. 

The best-effort delay model was built, in part, from
results of queueing simulations with traffic stream inputs
of two types: live and synthetic. The live streams
are measurements of packet arrivals and sizes for
349 intervals, 90 s or 5 min in duration, from six Internet
links. The synthetic streams are arrivals and sizes
generated by recently developed FSD time series models
for the arrivals and sizes of best-effort traffic. Each
of the live streams was fitted by two FSD models (one
for the interarrivals and one for the sizes) and a synthetic
stream of 5 min was generated by the models.
The generated interarrivals are independent of the generated
sizes, which is what we found in the live data.
The result is 349 synthetic streams that collectively
match the statistical properties of the live streams. For
each live or synthetic stream, we carried out 25 runs,
each with a number of simulations. For each run we
picked a delay $\delta$ and a delay probability $\omega$; simulations
were carried out to find the QoS bandwidth, which is
the bandwidth that results in delay probability $\omega$ for $\delta$.
This also yields a QoS utilization $u = \tau/\beta$. We used
five delays (0.001, 0.005, 0.010, 0.050 and 0.100 s)
and five delay probabilities (0.001, 0.005, 0.01, 0.02
and 0.05), and employed all 25 combinations of the
two delay criteria. The queueing simulation results in
delay data, that is, values of five variables: QoS utilization
$u$, delay $\delta$, delay probability $\omega$, the mean number
of active connections of the traffic $c$ and the traffic bit
rate $\tau$. The delay data were used in the model building.
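One way to organize the 25 runs for a single stream is a bisection search over the bandwidth for each pair of delay criteria. The Python sketch below is a minimal illustration under stated assumptions, not the simulation code of the study: the search strategy, the function names, the cap of the utilization at 1 and the toy Poisson stream are all choices made here.

```python
import random
from itertools import product

DELAYS = (0.001, 0.005, 0.010, 0.050, 0.100)            # seconds
DELAY_PROBABILITIES = (0.001, 0.005, 0.01, 0.02, 0.05)


def delay_probability(arrivals, bits, bandwidth, delta):
    """Fraction of packets whose FIFO waiting time exceeds delta."""
    wait, exceed = 0.0, 0
    for k in range(len(arrivals)):
        if k > 0:
            gap = arrivals[k] - arrivals[k - 1]
            wait = max(0.0, wait + bits[k - 1] / bandwidth - gap)
        exceed += wait > delta
    return exceed / len(arrivals)


def qos_bandwidth(arrivals, bits, delta, omega, rel_tol=1e-4):
    """Smallest bandwidth (to within rel_tol) meeting the (delta, omega)
    criteria, returned together with the utilization tau / bandwidth."""
    tau = sum(bits) / (arrivals[-1] - arrivals[0])      # traffic bit rate
    lo = tau
    if delay_probability(arrivals, bits, lo, delta) <= omega:
        return lo, 1.0                                  # utilization capped at 1
    hi = 2.0 * tau
    while delay_probability(arrivals, bits, hi, delta) > omega:
        hi *= 2.0                                       # bracket the solution
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if delay_probability(arrivals, bits, mid, delta) > omega:
            lo = mid
        else:
            hi = mid
    return hi, tau / hi


# Toy stream (assumed, not a trace from the study): Poisson arrivals,
# three packet sizes in bits.
random.seed(0)
arrivals, clock = [], 0.0
for _ in range(2000):
    clock += random.expovariate(1000.0)
    arrivals.append(clock)
bits = [random.choice((320, 4608, 12000)) for _ in arrivals]

# One (bandwidth, utilization) pair for each of the 25 criteria combinations.
results = {(d, w): qos_bandwidth(arrivals, bits, d, w)
           for d, w in product(DELAYS, DELAY_PROBABILITIES)}
```

Each entry of `results`, together with the stream's values of $c$ and $\tau$, corresponds to one row of delay data in the sense described above.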

10.5 Validity and Applicability 

Extensive data exploration with visualization tools 
(some shown here) demonstrates that the best-effort 
delay model fits the simulation delay data. This, of 
course, is necessary for the model to be valid. In ad- 
dition, validity is supported by the model reproducing 
the theoretical queueing properties as just discussed. 

The validity of the best-effort delay model depends 
on the validity of the traffic streams used as inputs 
to the queueing simulation; that is, the packet streams 
must reproduce the statistical properties of best-effort 
streams. Of course, the live streams of the study do so 
because they are best-effort traffic. Extensive valida- 
tion has shown that the FSD models used to generate 
the packet streams here provide excellent fits to best-effort
packet streams when $c$ is above about 64 connections,
which for a link where $\gamma_b$ is about $2^{14}$ bits/s
per connection means $\tau$ is above about 1 megabit/s.
For this reason, only traffic streams with $\tau$ greater than
this rate are used in the study, and the best-effort delay
model is valid above this rate. 

The results are only valid for links with a buffer large 
enough that the packet loss is negligible. We have used
an open-loop study, which does not provide for the TCP
feedback that occurs when loss is significant. This re- 
striction also holds for the other work on bandwidth 
estimation cited here. 

There is also a practical restriction on applicability. 
We have taken the range of our study to include traffic 
bit rates as low as about 1 megabit/s. We have done 
this simply because we can do so and achieve valid 
results, but even for the least stringent of our delay 
criteria ($\delta = 0.1$ s delay and $\omega = 0.05$ delay probability),
the utilizations are low for rates in the range of
1-5 megabits/s. This utilization might well be judged 
to be too small to be practical. If so, it might mean 
that the negligible packet loss must be sacrificed, which 
means that a QoS study at very low traffic bit rates 
needs to take account of TCP feedback. 

One outcome of the dependence of the bandwidth es- 
timation on the traffic statistics is that our solution for 
best-effort traffic would not apply to other forms of In- 
ternet traffic that do not share the best-effort statistical 
properties. One example is voice traffic. 

Finally, the best-effort delay model provides an es- 
timation of bandwidth in isolation without considering 
other network factors. A major factor in network de- 
sign is link failures. Redundancy needs to be built into 
the system. An estimate of bandwidth from the model 
for a link based on the normal link traffic may be re- 
duced to provide this redundancy. However, the model 
still plays a role because the bandwidth must be chosen 
based on link traffic, but now it is traffic in the event of 
a failure elsewhere. 